METHODS TO IDENTIFY COMPOUNDS FOR
DISRUPTING PROTEIN/PROTEIN INTERACTIONS 

Background of the Invention

The present invention relates to a novel method to identify
inhibitors of protein/protein interactions.

Background 

Modulation of protein/protein interactions is an attractive target
for drug discovery and development. Potential methods by which drugs can
regulate protein/protein interactions are numerous, including, for example,
regulation of expression of one or more of the binding proteins, modulation
of post-translational modification, and direct interference with the capacity of
one protein to bind to one or more binding partners. More importantly,
recent observations make it increasingly clear that supramolecular protein
complexes, involving two or more binding proteins, play important and
essential roles in signal transduction, gene expression, cell proliferation and
duplication, and cell cycle progression. For example, in the repair of
UV-damaged DNA, a so-called "repairsome" that contains over ten individual
proteins is assembled into a complex which can then carry out the necessary
repair. Likewise, gene transcription occurs through the concerted action of
more than twenty proteins. Signal transduction proteins, such as receptor
protein kinases, are part of large complexes with many proteins. Contacts
through Src homology type 2 (SH2) domains on the receptor kinases, for
example, are noteworthy protein interactions which are part of one or more
enzymatic cascades important for many metabolic processes. Disrupting the
binding capacity of one or more proteins which form any of these larger
complexes is therefore an important and untapped method to control action of
the overall complex.

Protein/protein interactions have been discovered and
characterized by a variety of methods: (i) standard biochemical affinity
methods such as chromatography or co-immunoprecipitations; (ii) gel overlay
methods; (iii) co-purification by traditional biochemistry; and (iv) two-hybrid
analysis [Fields and Song, Nature 340:245-246 (1989); Fields, Methods: A
Companion to Methods in Enzymology 5:116-124 (1993); U.S. Patent
5,283,173 issued February 1, 1994 to Fields, et al.]. The most recent of these
approaches, the two hybrid method, has enjoyed broad application because of
its relative ease of use for gene identification from cDNA fusion libraries.
[See Chien et al., Proc. Natl. Acad. Sci. (USA) 88:9578-9582 (1991); Dalton
and Treisman, Cell 72:223-232 (1993); and Durfee, et al., Genes and Devel.
7:555-569 (1993)].

The two hybrid system is based on targeting and identifying a
protein/protein interaction through the use of a reporter system. The
described two hybrid systems either use the yeast Gal4 DNA binding domain
or the E. coli lexA DNA binding domain and couple this region to a
transcriptional activator such as Gal4 or VP16 that drives a reporter like
β-galactosidase or HIS3.

In principle the two hybrid assay could be used for drug
screening. [See WO 96/03501 and WO 96/03499.] In such a scenario, loss
of β-galactosidase or HIS3 activity would be identified after the yeast strain
is treated with a compound. In practice, however, use of the two hybrid
system is technically undesirable for several reasons. In instances where the
β-galactosidase or HIS3 protein is employed as the reporter protein, a loss
of activity is particularly difficult to detect because the expressed reporter
protein is too long lived to be used in a high throughput mode. If a candidate
binding inhibitor compound is metabolized faster than the previously expressed
reporter protein is turned over, it is difficult to detect inhibitory action of the
candidate drug while a reporter protein is still active. In high throughput
screening, the loss of a positive signal, for example, β-galactosidase or HIS3,
is impossible to detect. Present roboticized screening and detection methods
are simply not sufficiently sensitive or robust to detect loss of a signal.
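
By way of illustration only, the half-life argument can be put in numbers. The sketch below assumes simple first-order decay of the pre-existing reporter pool once new synthesis stops. The half-lives and the assay window are hypothetical values chosen for the example and are not taken from this specification.

```python
def fraction_remaining(hours_elapsed: float, half_life_hours: float) -> float:
    """Fraction of a pre-existing reporter pool left after first-order decay."""
    return 0.5 ** (hours_elapsed / half_life_hours)


# Hypothetical values (assumptions for illustration, not measured data).
ASSAY_WINDOW_H = 4.0
HALF_LIVES_H = {"long-lived reporter": 20.0, "short-lived reporter": 0.5}

for label, half_life in HALF_LIVES_H.items():
    left = fraction_remaining(ASSAY_WINDOW_H, half_life)
    print(f"{label}: {left:.1%} of the signal remains after {ASSAY_WINDOW_H} h")
```

Under these assumed numbers most of a long-lived reporter is still present at the end of the assay window, so a loss-of-signal readout changes very little even when the interaction has been fully blocked.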
Thus there is a need in the art to develop a rapid screening
method that gives a positive signal, as opposed to a negative signal, when a
protein/protein interaction is disrupted. Such a system must be capable of
using protein interactions that are initially detected by any of the above
mentioned approaches and must be sufficiently robust to detect a gain of
function when a protein interaction is lost. In essence, the screening method
must give a signal when an interaction is lost, not lose a signal when an
interaction is lost. Such a system must be sensitive to subtle interactions, in
particular ones that are caused by post-translational modification like protein
phosphorylation. Finally, for large scale screening, such as high throughput
screening, the system must be manipulable such that a large signal-to-noise
ratio can be easily detected.



Brief Summary of the Invention 

In one aspect, the present invention provides materials that are
useful for the identification of compounds which inhibit interaction between
known binding partner proteins. See Figure 1. The invention provides host
cells transformed or transfected with DNA comprising: (i) a repressor gene
encoding a DNA binding protein that acts as a repressor protein, said repressor
gene under transcriptional control of a promoter; (ii) a selectable marker gene
encoding a selectable marker protein; said selectable marker gene under
transcriptional control of an operator; said operator regulated by interaction
with said repressor protein; (iii) a first recombinant fusion protein gene
encoding a first binding protein or binding fragment thereof in frame with
either a DNA binding domain of a transcriptional activating protein or a
transactivating domain of a transcriptional activating protein; and (iv) a second
recombinant fusion protein gene encoding a second binding protein or binding
fragment thereof in frame with either a DNA binding domain of a
transcriptional activating protein or a transactivating domain of a
transcriptional activating protein, whichever domain is not encoded by the first



WO 98/13502 



PCT/US97/17276 




fusion protein gene, said second binding protein or binding fragment thereof
capable of interacting with said first binding protein or binding fragment
thereof such that interaction of said second binding protein or binding
fragment thereof and said first binding protein or binding fragment thereof
brings into proximity a DNA binding domain and a transactivating domain
forming a functional transcriptional activating protein; said functional
transcriptional activating protein acting on said promoter to increase
expression of said repressor gene.
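
The regulatory logic just described can be summarized as a chain of three dependencies. The following is a minimal, idealized sketch, assuming all-or-nothing behavior with no basal ("leaky") expression at any step. The class and method names are illustrative only and do not appear in this specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitHybridCell:
    """Idealized on/off model of the host cell described above."""

    partners_interact: bool  # do the two fusion proteins bind each other?

    def activator_formed(self) -> bool:
        # Binding brings the DNA binding and transactivating domains together.
        return self.partners_interact

    def repressor_expressed(self) -> bool:
        # The reconstituted activator drives the repressor gene.
        return self.activator_formed()

    def marker_expressed(self) -> bool:
        # The repressor acts on the operator and shuts off the marker gene.
        return not self.repressor_expressed()


for interacting in (True, False):
    cell = SplitHybridCell(partners_interact=interacting)
    print(f"interaction={interacting} -> marker expressed={cell.marker_expressed()}")
```

In this simplified model, disrupting the interaction is the only way to turn the selectable marker on, which is what converts a loss of binding into a positive signal.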

The invention comprehends host cells wherein the various genes
and regulatory sequences are encoded on a single DNA molecule as well as
host cells wherein one or more of the repressor gene, the selectable marker
gene, the first recombinant fusion protein gene, and the second recombinant
fusion protein gene are encoded on distinct DNA expression constructs. In
a preferred embodiment, the host cells are transformed or transfected with
DNA encoding the repressor gene, the selectable marker gene, the first
recombinant fusion protein gene, and the second recombinant fusion protein
gene, each encoded on a distinct expression construct. Regardless of the
number of DNA expression constructs introduced, each transformed or
transfected DNA expression construct further comprises a selectable marker
gene sequence, the expression of which is used to confirm that transfection or
transformation was, in fact, accomplished. Selectable marker genes encoded
on individually transformed or transfected DNA expression constructs are
distinguishable from the selectable marker under transcriptional regulation of
the tet operator in that expression of the selectable marker gene regulated by
the tet operator is central to the preferred embodiment; i.e., regulated
expression of the selectable marker gene by the tet operator provides a
measurable phenotypic change in the host cell that is used to identify a binding
protein inhibitor. Selectable marker genes encoded on individually
transformed or transfected DNA expression constructs are provided as
determinants of successful transfection or transformation of the individual
DNA expression constructs. Preferred host cells of the invention include
transformed S. cerevisiae strains designated YI596 and YI584 which were
deposited August 13, 1996 with the American Type Culture Collection
(ATCC), 12301 Parklawn Drive, Rockville, Maryland 20852, and assigned
Accession Numbers ATCC 74384 and ATCC 74385, respectively.

The host cells of the invention include any cell type capable of
expressing the heterologous proteins required as described above and which
are capable of being transformed or transfected with functional promoter and
operator sequences which regulate expression of the heterologous proteins also
as described. In a preferred embodiment, the host cells are of either mammalian,
insect or yeast origin. Presently, the most preferred host cell is a yeast cell.
The preferred yeast cells of the invention can be selected from various strains,
including the S. cerevisiae yeast transformants described in Table 1.
Alternative yeast species include S. pombe, K. lactis, P. pastoris,
S. carlsbergensis and C. albicans. Preferred mammalian host cells of the
invention include Chinese hamster ovary (CHO), COS, HeLa, 3T3, CV1,
LTK, 293T3, Rat1, PC12 or any other transfectable cell line of human or
rodent origin. Preferred insect cell lines include SF9 cells.

In a preferred embodiment, the selectable marker gene is
regulated by an operator and encodes an enzyme in a pathway for synthesis
of a nutritional requirement for said host cell such that expression of said
selectable marker protein is required for growth of said host cell on media
lacking said nutritional requirement. Thus, as in a preferred embodiment
where a repressor protein interacts with the operator, transcription of the
selectable marker gene is down-regulated and the host cells are identified by
an inability to grow on media lacking the nutritional requirement and an
ability to grow on media containing the nutritional requirement. In a most
preferred embodiment, the selectable marker gene encodes the HIS3 protein,
and host cells transformed or transfected with a HIS3-encoding DNA
expression construct are selected following growth on media in the presence
and absence of histidine. The invention, however, comprehends any of a
number of alternative selectable marker genes regulated by an operator. Gene
alternatives include, for example, URA3, LEU2, LYS2 or those encoding any
of the multitude of enzymes required in various pathways for production of
a nutritional requirement which can be definitively excluded from the media
of growth. In addition, conventional reporter genes such as chloramphenicol
acetyltransferase (CAT), firefly luciferase, β-galactosidase (β-gal), secreted
alkaline phosphatase (SEAP), green fluorescent protein (GFP), human growth
hormone (hGH), β-glucuronidase, neomycin, hygromycin, thymidine kinase
(TK) and the like may be utilized in the invention.

In the preferred embodiment, the host cells include a repressor
protein gene encoding the tetracycline resistance protein which acts on the tet
operator to decrease expression of the selectable marker gene. The invention,
however, also encompasses alternatives to the tet repressor and operator, for
example, E. coli trp repressor and operator, his repressor and operator, and lac
operon repressor and operator.

The DNA binding domain and transactivating domain
components of the fusion protein may be derived from the same transcription
factor or from different transcription factors as long as bringing the two
domains into proximity permits formation of a functional transcriptional
activating protein that increases expression of the repressor protein with high
efficiency. A high efficiency transcriptional activating protein is defined as
having both a DNA binding domain exhibiting high affinity binding for the
recognized promoter sequence and a transactivating domain having high
affinity binding for transcriptional machinery proteins required to express
repressor gene mRNA. The DNA binding domain component of a fusion
protein of the invention can be derived from any of a number of different
proteins including, for example, LexA or Gal4. Similarly, the transactivating
component of the invention's fusion proteins can be derived from a number
of different transcriptional activating proteins, including, for example, Gal4 or
VP16. In one embodiment of the invention, polynucleotides encoding
binding partner proteins CREB and CBD are inserted in plasmids pVP16-CREB
and pLexA-CBD, respectively, which were deposited with the ATCC
and assigned Accession Numbers ATCC 98138 and ATCC 98139,
respectively.

The promoter sequence of the invention which regulates
transcription of the repressor protein can be any sequence capable of driving
transcription in the chosen host cell. The promoter may be a DNA sequence
specifically recognized by the chosen DNA binding domain of the invention,
or any other DNA sequence with which the DNA binding domain of the
fusion protein is capable of high affinity interaction. In a preferred
embodiment of the invention, the promoter sequence of the invention is either
a HIS3 or alcohol dehydrogenase (ADH) promoter. In a presently most
preferred embodiment, the ADH promoter is employed in the invention. The
invention, however, encompasses numerous alternative promoters, including,
for example, those derived from genes encoding HIS3, ADH, URA3, LEU2
and the like.

In another aspect, the invention provides methods to identify
molecules that inhibit interaction between known binding partner proteins. In
one embodiment, the invention provides a method to identify an inhibitor of
binding between a first binding protein or binding fragment thereof and a
second binding protein or binding fragment thereof comprising the steps of (a)
growing host cells transformed or transfected as described above in the
absence of a test compound and under conditions which permit expression of
said first binding protein or binding fragment thereof and said second binding
protein or binding fragment thereof such that said first binding protein or
fragment thereof and second binding protein or binding fragment thereof
interact bringing into proximity said DNA binding domain and said
transactivating domain forming a functional transcriptional activating protein;
the transcriptional activating protein acting on said promoter to increase
expression of said repressor protein; said repressor protein interacting with
said operator such that said selectable marker protein is not expressed; (b)
confirming lack of expression of said selectable marker protein in said host
cell; (c) growing said host cells in the presence of a test compound; and (d)
comparing expression of said selectable marker protein in the presence and
absence of said test compound wherein increased expression of said selectable
marker protein is indicative that the test compound is an inhibitor of binding
between said first binding protein or binding fragment thereof and said second
binding protein or binding fragment thereof.
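
By way of illustration, the comparison in step (d) can be reduced to a simple decision rule. The sketch below assumes that marker expression has been quantified as a non-negative number (for example, a growth reading) in the absence and presence of the test compound. The function name, the fold-increase threshold, and the floor value are hypothetical and are not specified by the method.

```python
def is_candidate_inhibitor(marker_without_compound: float,
                           marker_with_compound: float,
                           min_fold_increase: float = 3.0) -> bool:
    """Flag a compound when marker expression rises by at least the chosen fold.

    The 3-fold default and the small floor on the baseline are illustrative
    assumptions, not values taken from the specification.
    """
    if marker_without_compound < 0 or marker_with_compound < 0:
        raise ValueError("marker readings must be non-negative")
    baseline = max(marker_without_compound, 1e-9)  # avoid division by zero
    return marker_with_compound / baseline >= min_fold_increase


# Hypothetical readings: (without compound, with compound)
readings = {"compound A": (0.05, 0.60), "compound B": (0.05, 0.06)}
for name, (without, with_) in readings.items():
    print(name, "->", is_candidate_inhibitor(without, with_))
```

A practical screen would also need controls for compounds that affect growth or marker expression independently of the binding partners; that is outside the scope of this sketch.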

In a most preferred embodiment, the invention provides a
method to identify an inhibitor of binding between a first binding protein or
binding fragment thereof and a second binding protein or binding fragment
thereof comprising the steps of: (a) transforming or transfecting a host cell
with a first DNA expression construct comprising a first selectable marker
gene encoding a first selectable marker protein and a repressor gene encoding
a repressor protein, said repressor gene under transcriptional control of a
promoter; (b) transforming or transfecting said host cell with a second DNA
expression construct comprising a second selectable marker gene encoding a
second selectable marker protein and a third selectable marker gene encoding
a third selectable marker protein, said third selectable marker gene under
transcriptional control of an operator, said operator specifically acted upon by
said repressor protein such that interaction of said repressor protein with said
operator decreases expression of said third selectable marker protein; (c)
transforming or transfecting said host cell with a third DNA expression
construct comprising a fourth selectable marker gene encoding a fourth
selectable marker protein and a first fusion protein gene encoding a first
binding protein or binding fragment thereof in frame with either a DNA
binding domain of a transcriptional activation protein or a transactivating
domain of said transcriptional activation protein; (d) transforming or
transfecting said host cell with a fourth DNA expression construct comprising
a fifth selectable marker gene encoding a fifth selectable marker protein and
a second fusion protein gene encoding a second binding protein or binding
fragment thereof in frame with either the DNA binding domain of said
transcriptional activation protein or the transactivating domain of said
transcriptional activation protein, whichever is not included in the first fusion
protein gene; (e) growing said host cell under conditions which permit
expression of said first binding protein or fragment thereof and said second
binding protein or fragment thereof such that said first binding protein or
fragment thereof and second binding protein or binding fragment thereof
interact bringing into proximity said DNA binding domain and said
transactivating domain reconstituting said transcriptional activating protein;
said transcriptional activating protein acting on said promoter to increase
expression of said repressor protein; said repressor protein interacting with
said operator such that said third selectable marker protein is not expressed;
(f) detecting absence of expression of said selectable marker gene; (g) growing
said host cell in the presence of a test compound that is a putative inhibitor of
binding between said first binding protein or fragment thereof and said second
binding protein or fragment thereof; and (h) comparing expression of said
selectable marker protein in the
presence and absence of said test compound wherein increased expression of

said selectable marker protein is indicative of an ability of the test compound
to inhibit binding between said first binding protein or binding fragment
thereof and said second binding protein or binding fragment thereof such that
said transcriptional activating protein is not reconstituted, expression of said
repressor protein is not increased, and said operator increases expression of
said selectable marker protein.

The methods of the invention encompass any and all of the
variations in host cells as described above. In particular, the invention
encompasses a method wherein: the host cell is a yeast cell; the selectable
marker gene encodes HIS3; transcription of the selectable marker gene is
regulated by the tet operator; the repressor protein gene encodes the
tetracycline resistance protein; transcription of the tetracycline resistance
protein is regulated by the HIS3 promoter; the DNA binding domain is
derived from LexA; and the transactivating domain is derived from VP16.
In another embodiment, the invention encompasses a method wherein: the host
cell is a yeast cell; the selectable marker gene encodes HIS3; transcription of
the selectable marker gene is regulated by the tet operator; the repressor
protein gene encodes the tetracycline resistance protein; transcription of the
tetracycline resistance protein is regulated by the alcohol dehydrogenase
promoter; the DNA binding domain is derived from LexA; and the
transactivating domain is derived from VP16.

In alternative embodiments of the invention wherein the host
cell is a mammalian cell, variations include the use of mammalian DNA
expression constructs to encode the first and second recombinant fusion genes,
the repressor gene, and the selectable marker gene, and use of selectable
marker genes encoding antibiotic or drug resistance markers (e.g., neomycin,
hygromycin, thymidine kinase).

There are at least three different types of libraries used for the
identification of small molecule modulators. These include: (1) chemical
libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of random peptides, oligonucleotides or organic molecules.

Chemical libraries consist of structural analogs of known
compounds or compounds that are identified as "hits" via natural product
screening. Natural product libraries are collections of microorganisms,
animals, plants or marine organisms which are used to create mixtures for
screening by: (1) fermentation and extraction of broths from soil, plant or
marine microorganisms or (2) extraction of plants or marine organisms.
Combinatorial libraries are composed of large numbers of peptides,
oligonucleotides or organic compounds as a mixture. They are relatively easy
to prepare by traditional automated synthesis methods, PCR, cloning or
proprietary synthetic methods. Of particular interest are peptide and
oligonucleotide combinatorial libraries. Still other libraries of interest include
peptide, protein, peptidomimetic, multiparallel synthetic collection,
recombinatorial, and polypeptide libraries.

The utility of the various aspects of the invention is manifest.
Host cells of the invention are useful to demonstrate in vivo binding capacity
of both known and suspected binding partner proteins in a recombinant
system. Such an expression system permits systematic analysis of the
structure and function of a particular binding protein, thus permitting
identification and/or synthesis of potential modulators of the physiological
activity of the binding proteins. The methods of the invention are particularly
useful to identify and improve molecules which are capable of inhibiting
specific and general protein/protein interactions. Inhibitors identified by the
methods of the invention can then be examined for utility in vivo as
therapeutic and/or prophylactic medicaments for conditions associated with
various protein/protein interactions.

Description of the Drawing 

Figure 1 describes the mechanics of the split hybrid assays. 

Detailed Description of the Invention

The present invention relates generally to methods designated
split hybrid assays to identify inhibitors of protein/protein interactions and is
illustrated by the following examples describing various methods for making
and using the invention. In particular, Example 1 relates to construction of
various plasmids and expression constructs utilized in the invention. Example
2 describes generation of various yeast transformants used to identify inhibitor
compounds. Examples 3, 4, 5 and 6 address use of the split hybrid assay to
examine CREB/CBD binding, Tax/SRF binding, CKI/CREB binding and
AKAP 79 binding to various partner proteins, respectively. Example 7
describes general application of the split hybrid assay. Example 8 relates to
use of the split hybrid assay for weakly interacting binding partners. Example
9 describes general assay methods. Example 10 addresses use of the split
hybrid assay to identify agents that prevent receptor desensitization and drug
tachyphylaxis.



Example 1

Plasmid Construction 

In the examples that follow, various plasmid constructs were
utilized as described. To simplify discussion of the exemplified assays, this
example describes construction of the various plasmids used in the following
examples. For clarity, the plasmids are grouped according to common features
relating to their applications in the assays later discussed.

I. Plasmids Encoding Reporter Gene HIS3

A. pRS303/1xtetop-MluI

One copy of the tet operator sequence was engineered into
position -53 in the HIS3 promoter of pRS313 [Sikorski, R.S. et al., Genetics
122:19-27 (1989)] by using the polymerase chain reaction (PCR). Two
primary PCR reactions using pRS313 as a template were performed: one
utilized a 5'-terminal oligonucleotide designated Eco47III-5' and a 3'-inner
oligonucleotide designated Tetop internal 3' to yield a primary 5'-PCR product,
and the other utilized a 5'-inner oligonucleotide designated Tetop internal 5'
and a 3'-terminal oligonucleotide designated Nhe I 3' to yield a primary
3'-PCR product.

Eco47III-5'    SEQ ID NO: 1

5'-TTGGTGAGCGCTAGGAGTCACTGCCAG

Tetop int. 3'    SEQ ID NO: 2

5'-TATACTCTATCAATGATAGAGTAATTCATTATGTGATAATGCC

Tetop int. 5' SEQ ID NO: 3 

5'-ATTACTCTATCATTGATAGAGTATATAAAGTAATGTGATTTC

Nhe I 3' SEQ ID NO: 4 
5'-AATTCTGCTAGCCTCTGCAAAGC

The 5' and 3' inner oligonucleotides contain complementary sequence such that
the 3' sequence of the primary 5' PCR product overlaps with the 5' sequence of
the primary 3' PCR product. The 5' terminal oligonucleotide contains the
restriction site Eco47III while the 3' terminal oligonucleotide contains the
restriction site NheI in order to facilitate subsequent subcloning. The primary
PCR reactions were performed with Pfu DNA polymerase (Stratagene, La
Jolla, CA) using reaction conditions described by the manufacturer. PCR
products were isolated by Bio101 (Vista, CA) Gene Clean III gel extraction.
The primary 5' and 3' PCR products were then combined in a second PCR
reaction and amplified using the 5'- and 3'-terminal oligonucleotides,
Eco47III-5' and Nhe I 3'. The second PCR reaction was performed with Vent
DNA polymerase (New England Biolabs, Beverly, MA) using reaction
conditions described by the manufacturer, except that the reactions were
supplemented with 4 mM Mg2+. The final PCR product contained one tet
operator sequence inserted into position -53 of the HIS3 promoter and
nucleotides 52-48 deleted in the construction. The final PCR product was
isolated, digested with Eco47III and NheI and cloned into pRS313 previously
digested with Eco47III and NheI. The resulting plasmid was designated
pRS313/1xtetop. DNA sequencing confirmed the presence of one copy of the
tet operator sequence in pRS313/1xtetop and confirmed integrity of the
Eco47III and NheI junctions.
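
The overlap that lets the two primary PCR products prime on each other in the second reaction can be checked directly from the two inner oligonucleotides. The sketch below is illustrative only. It uses the sequences of SEQ ID NO: 2 and SEQ ID NO: 3 as transcribed here, with extraction spacing removed and an unreadable trailing character of SEQ ID NO: 3 omitted, which is an assumption. The helper names are not part of the specification.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def suffix_prefix_overlap(upstream_top: str, downstream_top: str) -> str:
    """Longest suffix of the upstream strand that is a prefix of the downstream strand."""
    for k in range(min(len(upstream_top), len(downstream_top)), 0, -1):
        if upstream_top.endswith(downstream_top[:k]):
            return downstream_top[:k]
    return ""


# Inner oligonucleotides as transcribed from SEQ ID NO: 2 and SEQ ID NO: 3.
TETOP_INT_3 = "TATACTCTATCAATGATAGAGTAATTCATTATGTGATAATGCC"
TETOP_INT_5 = "ATTACTCTATCATTGATAGAGTATATAAAGTAATGTGATTTC"

# The top strand of the 5' product ends with the reverse complement of its
# 3'-inner primer; the top strand of the 3' product starts with its 5'-inner primer.
overlap = suffix_prefix_overlap(reverse_complement(TETOP_INT_3), TETOP_INT_5)
print(len(overlap), overlap)
```

With these transcriptions the shared region is 25 nucleotides long and spans the inserted tet operator sequence.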

A MluI restriction enzyme site was engineered into position -22
in the HIS3 promoter of pRS313/1xtetop by PCR with Vent DNA
polymerase using pRS313/1xtetop as template. One PCR product was
amplified using the 5' terminal oligonucleotide Eco47III-5' (SEQ ID NO: 1)
containing an Eco47III restriction site and a 3'-oligonucleotide designated Mlu
I 3' containing a MluI restriction site.
Mlu I 3'    SEQ ID NO: 5

5'-CGCACGCGTCGAAGAAATCACATTACTTTATATA

A second PCR product was amplified using the 3'-terminal oligonucleotide
Nhe I 3' (SEQ ID NO: 4) containing a NheI restriction site and a
5'-oligonucleotide designated Mlu I 5' containing a MluI restriction site.

Mlu I 5'    SEQ ID NO: 6

5'-CGCACGCGTATACTAAAAAATGAGCAGGCAAG

The first PCR product was isolated and digested with Eco47III and MluI,
while the second PCR product was isolated and digested with MluI and NheI.
These digested products were isolated and ligated in a triple ligation with
pRS313 previously digested with Eco47III and NheI. The resulting plasmid
was designated pRS313/1xtetop-MluI. DNA sequencing confirmed the
presence of the MluI site in pRS313/1xtetop-MluI and confirmed that integrity
of the Eco47III and NheI junctions was maintained.
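
As a quick consistency check on the primers used so far, each oligonucleotide can be scanned for the restriction site it is stated to carry. The sketch below is illustrative. The recognition sequences are the standard ones for these enzymes, the primer sequences are transcribed from SEQ ID NOs: 1, 4, 5 and 6 with extraction spacing removed, and the dictionary and function names are hypothetical.

```python
# Standard recognition sequences (all four are palindromic, so scanning one
# strand is sufficient).
RECOGNITION_SITES = {
    "Eco47III": "AGCGCT",
    "NheI": "GCTAGC",
    "MluI": "ACGCGT",
    "SphI": "GCATGC",
}

# Primers as transcribed from SEQ ID NOs: 1, 4, 5 and 6.
PRIMERS = {
    "Eco47III-5'": "TTGGTGAGCGCTAGGAGTCACTGCCAG",
    "Nhe I 3'": "AATTCTGCTAGCCTCTGCAAAGC",
    "Mlu I 3'": "CGCACGCGTCGAAGAAATCACATTACTTTATATA",
    "Mlu I 5'": "CGCACGCGTATACTAAAAAATGAGCAGGCAAG",
}


def sites_in(sequence: str) -> list[str]:
    """Names of the listed enzymes whose recognition sequence occurs in `sequence`."""
    return sorted(name for name, site in RECOGNITION_SITES.items() if site in sequence)


for primer_name, primer_seq in PRIMERS.items():
    print(f"{primer_name}: {sites_in(primer_seq)}")
```

Each primer carries exactly the site named in the text, which supports the transcription of these sequences.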

A pRS303/1xtetop-MluI plasmid was constructed by first
removing the Eco47III/NheI fragment containing the altered HIS3 promoter
from the pRS313/1xtetop-MluI vector and ligating the isolated fragment into
pRS303 previously digested with Eco47III and NheI. DNA sequencing
confirmed proper insertion of the Eco47III/NheI fragment.

B. pRS303/2xtetop-LYS2

One copy each of the tet operator sequence was engineered into
positions -53 and -22 in the HIS3 promoter of pRS303 [Sikorski, et al.,
Genetics 122:19-27 (1989)]. PCR was utilized to engineer one copy into
position -53 which resulted in plasmid pRS303/1xtetop. To insert the second
copy, a MluI site was introduced at position -22 in the HIS3 promoter using
PCR. The new plasmid was designated pRS303/1xtetop-MluI.

The tet operator was created by annealing two complementary
oligonucleotides tetop-1 and tetop-2.



tetop-1    SEQ ID NO: 7

5'-CGCGTACTCTATCATTGATAGAGTA

tetop-2    SEQ ID NO: 8

5'-ATGAGATAGTAACTATCTCATGCGC



When annealed, the tet operator sequence contains flanking MluI sites. Both
oligonucleotides were phosphorylated using T4 polynucleotide kinase (Gibco
BRL, Grand Island, NY) at 37°C for one hour and annealed by first heating
at 70°C for 10 minutes and then cooling to room temperature. The annealed
oligonucleotides were isolated and ligated into pRS303/1xtetop-MluI
previously digested with MluI. The resulting plasmid was designated
pRS303/2xtetop. DNA sequencing confirmed insertion of one copy of the tet
operator sequence in the MluI site.

The LYS2 gene was digested from pLYS2 [Hollenberg, S.M.
et al., Mol. Cell. Biol. 15:3813-3822 (1995)] with EcoRI and HindIII and the
isolated fragment blunt ended using the large fragment of DNA polymerase
I (Gibco BRL, Grand Island, NY). Phosphorylated SstI linkers (New England
Biolabs, Beverly, MA) were ligated to the fragment, the fragment digested
with SstI, and the resulting fragment ligated into pRS313 previously digested
with SstI. The resulting plasmid was designated pRS313/LYS2.

The LYS2 fragment was removed from pRS313/LYS2 with SstI
digestion and inserted into pRS303/2xtetop previously digested with SstI. The
resulting plasmid was designated pRS303/2xtetop-LYS2.

Similarly, the LYS2 SstI fragment was inserted into
pRS303/1xtetop-MluI previously digested with SstI to yield
pRS303/1xtetop-MluI-LYS2.



C. pRS303/3xtetop-LYS2 
Two copies of the tet operator sequence were created by annealing a
palindromic oligonucleotide, Tetop 2x, with itself.

Tetop 2x    SEQ ID NO: 9

5'-CGCGTACTCTATCATTGATAGAGTCTAGACTCTATCAATGATAGAGTA

The annealed oligonucleotide contained flanking MluI sites. The
oligonucleotide was phosphorylated, annealed, and isolated as above. The
isolated annealed and MluI-digested oligonucleotide was ligated into
pRS303/1xtetop-MluI-LYS2 previously digested with MluI to yield
pRS303/3xtetop-LYS2. The presence of two copies of the tet operator
sequence in the MluI site was confirmed by DNA sequencing.
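
The self-annealing property of Tetop 2x can be verified computationally. The sketch below is illustrative and rests on two assumptions: that the one garbled base near the 3' end of SEQ ID NO: 9 in this copy is G (the only choice that makes the oligonucleotide palindromic, as the text states it is), and that the 19-nucleotide core shared with tetop-1 is a fair stand-in for "one operator copy". The variable names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


# SEQ ID NO: 9, with the garbled base assumed to be G.
TETOP_2X = "CGCGTACTCTATCATTGATAGAGTCTAGACTCTATCAATGATAGAGTA"

# A strand of the form 5'-CGCG + core anneals with a second copy of itself
# through the core, leaving a 5'-CGCG overhang at each end. MluI (A^CGCGT)
# also leaves 5'-CGCG overhangs, so the duplex fits an MluI-cut vector.
overhang, core = TETOP_2X[:4], TETOP_2X[4:]
core_is_palindromic = core == reverse_complement(core)

# Assumed 19-nt operator core (from tetop-1); count it in either orientation.
OPERATOR_CORE = "ACTCTATCATTGATAGAGT"
operator_copies = core.count(OPERATOR_CORE) + core.count(reverse_complement(OPERATOR_CORE))

print(overhang, core_is_palindromic, operator_copies)
```

Under these assumptions the duplex carries two operator copies, one in each orientation, consistent with the two copies confirmed by sequencing.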

D. pRS303/4xtetop-LYS2 and pRS303/8xtetop-LYS2

Three or seven copies of the tet operator were created using
PCR with Vent DNA polymerase as described above. Plasmid pUHC-13-3
[Gossen and Bujard, Proc. Natl. Acad. Sci. (USA) 89:5547-5551 (1992)] was
used as template DNA using 5'- and 3'-oligonucleotides, Mlu I/Sph I 5' and
Mlu I/Sph I 3', containing an exterior MluI restriction enzyme site nested
internally by a SphI restriction enzyme site.

Mlu I/Sph 1 5' SEQ ID NO: 10 

5 ' -GCG ACGCGTGC ATGCCGTCTTC AAGAATTCCTCG AG 

20 Mlu I Sph 1 3' SEQ ID NO: i 1 

5'-GCGACGCGTGCATGCCCACCGTACACGCCTACTCGA 

The PCR products were separated on an agarose gel and the ladder of 
differenl sized DNA fragments was isolated, digested with Mlul, and ligated 
into the Mlul restriction site of pRS303/Ixtetop-MluI-LYS2. DNA sequenc- 
25 ing revealed that either three or seven copies of tet operators were inserted 
into the Mlu site of pRS303/lxtetop-Af/wI-LYS2 to provide either 
pRS303/4xtetop-LYS2 or pRS303/8xtetop-LYS2. 
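
The nested arrangement of restriction sites in the two primers can be confirmed by a simple string search. The Python sketch below is a minimal illustration, not part of the described method; the function name site_positions is hypothetical, and the recognition sequences used are the standard ones for MluI (ACGCGT) and SphI (GCATGC).

```python
# Locate the MluI and SphI recognition sequences in the two PCR primers.
MLUI_SITE = "ACGCGT"
SPHI_SITE = "GCATGC"

PRIMERS = {
    "Mlu I/Sph I 5' (SEQ ID NO: 10)": "GCGACGCGTGCATGCCGTCTTCAAGAATTCCTCGAG",
    "Mlu I/Sph I 3' (SEQ ID NO: 11)": "GCGACGCGTGCATGCCCACCGTACACGCCTACTCGA",
}


def site_positions(primer: str) -> tuple[int, int]:
    """Return 0-based start indices of the MluI and SphI sites (-1 if absent)."""
    return primer.find(MLUI_SITE), primer.find(SPHI_SITE)


positions = {name: site_positions(seq) for name, seq in PRIMERS.items()}
for name, (mlu, sph) in positions.items():
    # A smaller MluI index means MluI lies nearer the 5' end of the primer,
    # i.e. exterior to SphI in the amplified fragment.
    print(f"{name}: MluI at {mlu}, SphI at {sph}")
```

Because MluI sits 5' of SphI in both primers, an amplified ladder fragment can in principle be released with either enzyme, which matches the later use of SphI to excise the operator multimers as a single fragment.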
E. pRS303/6xtetop-LYS2 and pRS303/10xtetop-LYS2

A SphI restriction enzyme site was introduced at position -85
in the HIS3 promoter of pRS303/3xtetop-LYS2 using PCR with Vent DNA
polymerase as described. Plasmid pRS303/3xtetop-LYS2 was used as
template DNA. A first fragment was amplified using the 5'-terminal
oligonucleotide Eco47 III-5' (SEQ ID NO: 1) described above containing an
Eco47III restriction site and a 3' oligonucleotide Sph I 3' containing a SphI
restriction site.

Sph I 3' SEQ ID NO: 12

5'-CATGGCATGCAAAAAAAAAGAGTCATCCGCTAGG

A second PCR product was amplified using the 3'-terminal oligonucleotide
Nhe I 3' (SEQ ID NO: 4) described above containing a NheI restriction site
and a 5' oligonucleotide Sph I 5' containing a SphI restriction site.

Sph I 5' SEQ ID NO: 13

5'-CATGGCATGCTTAGCGATTGGCATTATCACAT

The PCR products were isolated as described above. The first PCR product
was digested with Eco47III and SphI, and the second PCR product was
digested with SphI and NheI. Both digestion products were ligated in a triple
ligation along with pRS303/3xtetop-LYS2 previously digested with both
Eco47III and NheI. The resulting plasmid was designated pRS303/3xtetop-
SphI-LYS2. The presence of the SphI site in pRS303/3xtetop-SphI-LYS2 was
confirmed by DNA sequencing analysis.

Three copies of tet operators were isolated as a single fragment
by digesting pRS303/4xtetop-LYS2 with SphI. The isolated fragment was
ligated into the SphI site of pRS303/3xtetop-SphI-LYS2 to yield
pRS303/6xtetop-LYS2. The presence of three additional copies of the tet
operator in pRS303/6xtetop-LYS2 at the SphI site was confirmed by DNA
sequencing.

Seven copies of tet operators were isolated as a single fragment
by digesting pRS303/8xtetop-LYS2 with SphI. The isolated fragment was
ligated into the SphI site of pRS303/3xtetop-SphI-LYS2 to yield
pRS303/10xtetop-LYS2. The presence of seven additional copies of the tet
operator in pRS303/10xtetop-LYS2 at the SphI site was confirmed by DNA
sequencing.

F. pRS313/MluI and pRS303/MluI

A MluI restriction enzyme site was engineered into position -22
in the HIS3 promoter of pRS313 utilizing PCR and Vent DNA polymerase as
noted above. Plasmid pRS313 was used as a template for these PCR
reactions. One PCR construct was amplified using the 5'-terminal
oligonucleotide Eco47 III-5' (SEQ ID NO: 1) containing an Eco47III
restriction site and a 3' oligonucleotide Mlu I 3' (SEQ ID NO: 5) containing
a MluI restriction site. A second PCR product was amplified using the 3'-
terminal oligonucleotide Nhe I 3' (SEQ ID NO: 4) containing a NheI
restriction site and the 5' oligonucleotide Mlu I 5' (SEQ ID NO: 6) containing
a MluI restriction site. The first PCR product was isolated and digested with
Eco47III and MluI, while the second PCR product was isolated and digested
with MluI and NheI. The digested products were partially purified and joined
in a triple ligation with pRS313 which had been previously digested with
Eco47III and NheI. The resulting plasmid was designated pRS313/MluI.
DNA sequencing confirmed the presence of the MluI site in pRS313/MluI and
the integrity of the Eco47III and NheI junctions.

pRS303/MluI was constructed in exactly the same manner as
pRS313/MluI except that pRS303 was used in place of pRS313.



G. pRS313/1xtetop

See above wherein pRS313/1xtetop is an intermediate in the
construction of pRS303/1xtetop-MluI.

H. pRS313/MluI-1xtetop and pRS303/MluI-1xtetop

One copy of the tet operator sequence was created by annealing
two complementary oligonucleotides tetop-1 and tetop-2 (SEQ ID NO: 7 and
SEQ ID NO: 8). The annealed tet operator sequence contains flanking MluI
sites. The oligonucleotides were phosphorylated using T4 polynucleotide
kinase (Gibco BRL, Grand Island, NY) at 37°C for one hour and annealed by
first heating at 70°C for 10 minutes followed by cooling to room temperature.
The annealed oligonucleotides were isolated and ligated separately into MluI-
digested pRS313/MluI and pRS303/MluI, the resulting plasmids being
designated pRS313/MluI-1xtetop and pRS303/MluI-1xtetop. DNA sequencing
confirmed the presence of one copy of the tet operator in the MluI sites of
both plasmids.

In order to produce plasmids bearing multiple copies of the tet
operator, annealed oligonucleotides described above were ligated together
overnight at 16°C. After isolation of the ligation products, they were inserted
into the MluI site of pRS313/MluI. DNA sequencing analysis confirmed that one
clone, pRS313/MluI-4xtetop, was produced which contained four copies of tet
operator in the MluI site. However, upon further examination of this clone
it was discovered that it had been subjected to a recombination event and was
therefore not useful for further cloning steps. Continued attempts to insert
multiple copies of the tet operator into the MluI site of pRS313/MluI by
ligating multimers of the tet operator have been unsuccessful.



I. pRS313/1xtetop-MluI

See above wherein construction of pRS313/1xtetop-MluI was
an intermediate in the construction of pRS303/1xtetop-MluI.
J. pRS313/2xtetop

One copy of the tet operator sequence was created using
annealed complementary oligonucleotides tetop-1 and tetop-2 (SEQ ID NO:
7 and SEQ ID NO: 8). Annealed oligonucleotides were ligated into the MluI
site of pRS313/1xtetop-MluI to yield pRS313/2xtetop. DNA sequencing
confirmed the presence of two copies of the tet operator in the MluI site.

K. pRS303/2xtetop 

See above wherein pRS303/2xtetop was an intermediate in the
construction of pRS303/2xtetop-LYS2.

L. pRS313/LYS2 and pRS313/LYS2

The LYS2 gene was excised from pLYS2 by EcoRI and
HindIII digestion. The EcoRI/HindIII fragment was blunt ended using the
large fragment of DNA polymerase I (Gibco BRL, Grand Island, NY) and
ligated with phosphorylated SstI linkers (New England Biolabs, Beverly, MA).
The resulting fragment was digested with SstI and ligated into pRS313
previously digested with SstI. The resulting plasmid was designated
pRS313/LYS2. Because the LYS2 fragment was shown to have inserted into
pRS313 in both orientations, plasmids with the LYS2 gene in both orientations
were transformed separately into the yeast strain SEY6210α (MATα leu2-
3,112 ura3-52 his3-Δ200 trp1-Δ901 lys2-801 suc2-Δ9) [Robinson et al., Mol.
Cell. Biol. 8:4936-4948 (1988)]. Both clones allowed the yeast to grow in the
absence of lysine, indicating that orientation of the LYS2 gene in pRS313 did
not affect the expression of an active gene.

The LYS2 fragment was removed from pRS313/LYS2 with SstI
and ligated into the SstI site of:

pRS313/1xtetop-MluI giving plasmid pRS313/1xtetop-MluI-LYS2,
pRS313/2xtetop giving plasmid pRS313/2xtetop-LYS2,
pRS303/1xtetop-MluI giving plasmid pRS303/1xtetop-MluI-LYS2, and
pRS303/2xtetop giving plasmid pRS303/2xtetop-LYS2.

II. Plasmids Encoding Reporter Gene TetR

A. pRS306/HIS3:TetR/Term

The 5' promoter sequence of the yeast HIS3 gene,
encompassing nucleotides -75 to +23, was ligated to the translational start of
TetR. In addition, the DNA sequence encoding the simian virus 40 (SV40)
large T antigen nuclear localization signal was ligated in frame with the
nucleotide sequence encoding the last amino acid residue of TetR. The
chimeric fragment was created by the same PCR strategy as described above.

The HIS3 promoter fragment, the primary 5'-PCR product, was
amplified by PCR from plasmid p601 [Grueneberg, D.A., Science 257:1089-
1095 (1992)] using a 5'-terminal oligonucleotide T7 Promoter primer and a
3'-inner oligonucleotide 3'-TetR inner primer.

T7 Promoter primer SEQ ID NO: 14

5'-TAATACGACTCACTATATAGGG

3' TetR inner primer SEQ ID NO: 15

5'-TCTAGACTTTGCCTTCGTTTATC

The primary 3' PCR product containing the TetR coding sequence was
amplified from pSLF104 [Forsburg, Nucl. Acid. Res. 21:2955-2956 (1993)]
with a 5'-inner oligonucleotide 5'-TetR inner primer and a 3'-terminal
oligonucleotide 3'-TetR terminal primer.

5' TetR inner primer SEQ ID NO: 16

5'-CGAAGGCAAAGATGTCTAGATTAGATAAAAG

3'-TetR terminal primer SEQ ID NO: 17

5'-CGCGGATCCGCTTTCTCTTCTTTTTTGGAGACCCACTTTCACATT^ 
An EcoRI site derived from the p601 fragment and a BamHI site in the 3'-
terminal oligonucleotide were used in subsequent subcloning. The PCR
products were gel-purified and amplified in a second PCR reaction with 5'-
and 3'-terminal oligonucleotides, T7 Promoter primer (SEQ ID NO: 14) and
3'-TetR terminal primer (SEQ ID NO: 17). The secondary PCR product was
isolated, digested with EcoRI and BamHI, and ligated into pRS306/Term
previously digested with EcoRI and BamHI. The resulting plasmid was
designated pRS306/HIS3:TetR/Term, which comprises the complete TetR
coding sequence in frame with sequences encoding the nuclear localization
signal of SV40 large T antigen.

B. pRS316/HIS3:TetR/Term

The construction protocol for this plasmid was the same as
described above for subcloning HIS3 DNA into pRS306/Term except that
the vector for subcloning was pRS316/Term described above.



C. pRS306/1xLexAop/HIS3:TetR

Oligonucleotides LexAop (100a) and LexAop (100b) containing
a single copy of LexA operator were phosphorylated with T4 polynucleotide
kinase (Gibco BRL, Grand Island, NY) at 37°C for one hour.

LexAop (100a) SEQ ID NO: 18

5'-AATTGCTCGAGTACTGTATGTACATACAGTAG

LexAop (100b) SEQ ID NO: 19

5'-AATTCTACTGTATGTACATACAGTACTCGAGC

Following phosphorylation, the oligonucleotides were annealed by heating at
70°C for 10 minutes followed by cooling to room temperature. The annealed
oligonucleotide containing 5' and 3' EcoRI overhanging ends was subcloned
into pRS306/HIS3:TetR/Term previously digested with EcoRI. The number
of copies of inserted oligonucleotide was confirmed by DNA sequencing. The
plasmid containing a single copy of the LexA operator was designated
pRS306/1xLexAop/HIS3:TetR.
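
The pairing of LexAop (100a) with LexAop (100b) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of how the oligonucleotides were designed: the sequences are taken as printed with spacing removed, and the names reverse_complement and duplex_region are introduced here for the example. The check shows that, once the leading 5'-AATT of each strand is set aside, the two strands are exact reverse complements, so the annealed duplex carries a 5'-AATT single-stranded end on each side, the overhang left by EcoRI (G^AATTC).

```python
# Check that LexAop (100a) and LexAop (100b) anneal with EcoRI-type overhangs.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
ECORI_OVERHANG = "AATT"

LEXAOP_100A = "AATTGCTCGAGTACTGTATGTACATACAGTAG"  # SEQ ID NO: 18
LEXAOP_100B = "AATTCTACTGTATGTACATACAGTACTCGAGC"  # SEQ ID NO: 19


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def duplex_region(oligo: str) -> str:
    """Portion of the oligonucleotide expected to be base-paired after annealing."""
    return oligo[len(ECORI_OVERHANG):]


has_overhangs = all(o.startswith(ECORI_OVERHANG) for o in (LEXAOP_100A, LEXAOP_100B))
strands_pair = reverse_complement(duplex_region(LEXAOP_100A)) == duplex_region(LEXAOP_100B)

print(has_overhangs, strands_pair, len(duplex_region(LEXAOP_100A)))
```

Because both ends of the duplex carry the same overhang, annealed duplexes can also ligate to one another before insertion, which is why the number of inserted copies has to be confirmed by sequencing.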



D. pRS316/2xLexAop/HIS3:TetR

The subcloning protocol for this construct was the same as
described above for pRS306/1xLexAop/HIS3:TetR. The annealed
oligonucleotides encoding the LexA operator included overhanging EcoRI ends
and during ligation, the individual annealed fragments were able to
multimerize, inserting into the parental plasmid more than one copy of the
desired LexA sequence. The number of copies of inserted oligonucleotides
was confirmed by DNA sequencing.



E. pRS306/2xLexAop/HIS3:TetR

A DNA fragment containing two copies of LexA operator and
the chimeric HIS3:TetR reporter was excised from
pRS316/2xLexAop/HIS3:TetR by digestion with KpnI and BamHI restriction
enzymes. The fragment was gel-purified and subcloned into pRS306/Term
previously digested with KpnI and BamHI and the resulting construct was
sequenced to confirm the presence of two copies of the LexA operator.



F. pRS306/4xLexAop/HIS3:TetR
and pRS306/8xLexAop/HIS3:TetR

A pair of oligonucleotides SH101A and SH101B were utilized
in PCR to amplify the LexA binding site multimer from the plasmid pSH18-
34ΔSpe [Hollenberg, S.M., et al., Mol. Cell. Biol. 15:3813-3822 (1995)].

SH101A SEQ ID NO: 20

5'-CCGGAATTCTCGAGACATATCCATATCTAATC

SH101B SEQ ID NO: 21

5'-CCGGAATTCACTAATCGCATTATCATC

The amplification product containing four copies of LexA operator was
gel-purified, digested with EcoRI, and subcloned into pRS306/HIS3:TetR/Term
previously digested with EcoRI. The number of LexA operators was
determined by DNA sequencing.

G. pRS306/8xLexAop/HIS3::TetR

A PCR strategy was used to link the 5' promoter sequence of
the yeast HIS3 gene encompassing nucleotides -75 to +23 to the translational
start of TetR. Sequences encoding the SV40 large T antigen nuclear
localization signal were fused in frame with the nucleotide sequence encoding the
last amino acid residue of TetR. The PCR product was digested with EcoRI
and BamHI and inserted into pRS306/Term previously digested with EcoRI
and BamHI. The resulting plasmid was designated pRS306/HIS3:TetR/Term,
and was shown to encode the complete TetR protein in frame with the nuclear
localization signal of SV40 large T antigen. The fusion protein is followed
by four amino acids generated by the vector backbone (Arg-Ile-His-Asp).

The LexA binding site multimer from the plasmid pSH18-
34ΔSpe [Hollenberg, S.M. et al., Mol. Cell. Biol. 15:3813-3822 (1995)] was
amplified by PCR, digested with EcoRI, and subcloned into the EcoRI site of
pRS306/HIS3:TetR/Term resulting in plasmid pRS306/8xLexAop/TetR.

H. pADH/TetR

The DNA coding sequence of TetR was amplified by PCR from
pSLF104 using two oligonucleotides, NcoI-TetR and 3'-TetR terminal primer
(SEQ ID NO: 17).

NcoI-TetR SEQ ID NO: 22

5'-CATGCCATGGCCATGTCTAGATTAGATAAAAG

The resulting product was gel-purified, digested with NcoI and BamHI, and
subcloned into a pBTM116 [Bartel, et al., in Cellular Interactions in
Development: a Practical Approach, Hartley (ed.), IRL Press, Oxford, pp.
153-179 (1993)] shuttle vector containing an ADH promoter, previously
digested with NcoI and BamHI. For construction of this vector, DNA
generated by PCR and DNA obtained by restriction enzyme digestion of the
polylinker region in plasmid pBluescript (Stratagene, La Jolla, California)
were used to engineer additional restriction sites 5' and 3' of the ADH
promoter. The TetR protein encoded by this construct is expressed
containing additional amino acids Met(-2)-Ala(-1) before the initiating methionine
and also contains the nuclear localization signal of SV40 large T antigen
located after the last amino acid of TetR as described above.

I. pRS306/ADH:TetR/Term

A fragment encoding the ADH promoter and TetR was removed
from plasmid pADH/TetR with XhoI and blunt-ended with the large
fragment of DNA polymerase I (Gibco BRL, Grand Island, NY). EcoRI
linkers (New England BioLabs, Beverly, MA) were added and the fragment
was digested with EcoRI and BamHI. The resulting fragment was gel-purified
and ligated into pRS306/Term previously digested with EcoRI and BamHI.

J. pRS306/4xLexAop/ADH::TetR
and pRS306/8xLexAop/ADH::TetR

The subcloning protocol used to insert multiple copies of the
LexA operator into pRS306/ADH:TetR/Term was the same as described
previously for pRS306/4xLexAop/HIS3:TetR and
pRS306/8xLexAop/HIS3:TetR.
III. Plasmids Encoding Binding Proteins

A. pLexA-CBD

A DNA fragment containing the CREB binding domain of CBP
(CBD), amino acids 461-682, was PCR amplified from plasmid CBP-0.8
[Chrivia, J.C. et al., Nature 365:855-859 (1993)] using a pair of
oligonucleotides designated 5' CBD primer and 3' CBD primer.

5' CBD primer SEQ ID NO: 23

5'-GCGAATTCGCCAGGGCAACAGAATGCCACT

3' CBD primer SEQ ID NO: 24

5'-CGGGATCCTGGCTGGTTACCCAGGATGCCTTG

Following gel purification, the amplification product was digested with EcoRI
and BamHI, and ligated into plasmid pBTM116 [Bartel, et al., in Cellular
Interactions in Development: a Practical Approach, (ed) Hartley, D.A. (IRL
Press, Oxford), pp. 153-179 (1993)] previously digested with EcoRI and
BamHI.

B. pVP16-CBD

A DNA fragment encoding the CBP sequence was excised from
pLexA-CBD by digestion with EcoRI and BamHI. Plasmid pLexA-CBD was
linearized with EcoRI digestion, the resulting overhanging ends blunt-ended
using the Klenow fragment of DNA polymerase I, and the ends ligated with
BamHI linkers. The resulting fragment was inserted into pVP16 [Hollenberg,
et al., Mol. Cell. Biol. 15:3813-3822 (1995)] previously digested with
BamHI.
C. pVP16-CREB

Plasmid pcDNA3/CREB283 [Sun and Maurer, J. Biol. Chem.
270:7041-7044 (1995)], containing the VP16 transactivation domain fused to
sequences of the rat CREB transactivation domain (1 to 283 aa), was linearized
with XhoI, and BamHI linkers (New England BioLabs) were ligated to the resulting
blunt-ended XhoI sites. DNA encoding the VP16/CREB chimeric protein was
removed with HindIII and BamHI digestion and, following gel purification,
ligated into the HindIII and BamHI sites of pVP16, which encodes the LEU2
gene.

D. pVP16-CREB(BglII-SacII)-LacZ

A DNA fragment encoding β-galactosidase was PCR amplified
from plasmid pSV-β-galactosidase vector (Promega, Madison, WI) using a
pair of oligonucleotides, 5' β-gal primer and 3' β-gal primer, and inserted into
the NotI site of pVP16 to produce pVP16-LacZ.

5' β-gal primer SEQ ID NO: 29

5'-ATGGTACCAGCGGCCGCTAGTCGTTTTACAACGTCGTGAC

3' β-gal primer SEQ ID NO: 30

5'-ATGGTACCGCGGCCGCTTATTTTTGACACCAGACCAAC

A PCR fragment containing CREB sequences encoding amino acid residues
1 to 283 was amplified from plasmid pRSV-CREB341 [Kwok, et al., Nature
380:642-646 (1996)] using a pair of oligonucleotides, 5' CREB 341 primer
and 3' CREB 283 primer, and inserted into pVP16-LacZ vector at the BamHI
site.

5' CREB 341 primer SEQ ID NO: 25

5'-CGCGGATCCGGATGACCATGGACTCTGGAG

3' CREB 283 primer SEQ ID NO: 28

5'-CGCGGATCCGTGCTGCTTCTTCAGCAGGCTG

To generate a cassette vector for producing and subcloning mutated CREB
sequences as described below, PCR was used to engineer a BglII site using
oligonucleotides 5' BglII primer and 3' BglII primer, at nucleotides 273 to
278 and a SacII site using oligonucleotides 5' SacII primer and 3' SacII
primer at nucleotides 500 to 505 of the CREB activation domain.

5' BglII primer SEQ ID NO: 31

5'-CGGAGATCTAAAGAGACTTTTCTCCGGAACTCAG

3' BglII primer SEQ ID NO: 32

5'-CGGAGATCTTTACAGGAAGACTGAACTGT

5' SacII primer SEQ ID NO: 33

5'-CCACCGCGGCAGTGCCAACCCCGATTTAC

3' SacII primer SEQ ID NO: 34

3'-CATCCGCGGTGGTGATGGCAGGGGCTGA
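
The cassette design can be illustrated with a short Python sketch that (i) confirms each primer carries the recognition sequence of its enzyme and (ii) converts the stated nucleotide positions into codon numbers. This is a hedged illustration only: the function name codons_spanned is hypothetical, and the codon arithmetic assumes that nucleotide 1 is the A of the CREB initiation codon, which the specification does not state explicitly.

```python
# Check primer sites and map the engineered site positions onto codon numbers.
BGLII_SITE = "AGATCT"
SACII_SITE = "CCGCGG"

PRIMERS = {
    "5' BglII (SEQ ID NO: 31)": ("CGGAGATCTAAAGAGACTTTTCTCCGGAACTCAG", BGLII_SITE),
    "3' BglII (SEQ ID NO: 32)": ("CGGAGATCTTTACAGGAAGACTGAACTGT", BGLII_SITE),
    "5' SacII (SEQ ID NO: 33)": ("CCACCGCGGCAGTGCCAACCCCGATTTAC", SACII_SITE),
    "3' SacII (SEQ ID NO: 34)": ("CATCCGCGGTGGTGATGGCAGGGGCTGA", SACII_SITE),
}


def codons_spanned(first_nt: int, last_nt: int) -> list[int]:
    """Codon numbers overlapped by a 1-based nucleotide window.

    Assumption: nucleotide 1 is the first base of the initiation codon.
    """
    first_codon = (first_nt - 1) // 3 + 1
    last_codon = (last_nt - 1) // 3 + 1
    return list(range(first_codon, last_codon + 1))


site_found = {name: site in seq for name, (seq, site) in PRIMERS.items()}
bglii_codons = codons_spanned(273, 278)
sacii_codons = codons_spanned(500, 505)

print(site_found)
print("BglII window overlaps codons", bglii_codons)
print("SacII window overlaps codons", sacii_codons)
```

Under the assumed numbering, the SacII window overlaps codons 167 to 169 and the two sites bracket residues 102 to 160, which would be consistent with the statement in Example 3 that introducing the sites altered the residue at position 168.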

E. pLexA-CREB 283

A DNA fragment containing the rat CREB transactivation
domain (amino acids 1 to 283) was excised from pcDNA/CREB283 [Sun and
Maurer, supra] with SmaI and XbaI digestion. The 5' XbaI site was blunt
ended with the large fragment of DNA polymerase I (Gibco BRL, Grand
Island, NY) and SalI linkers (New England Biolabs, Beverly, MA) added.
The fragment was digested with SalI and subcloned into the SalI site of
pBTM116.

F. pLexA-CREB 341

A DNA fragment containing the rat CREB 341 cDNA was
amplified by PCR from pcDNA/CREB341 [Kwok, supra] using a pair of
oligonucleotides, 5' CREB 341 primer (SEQ ID NO: 25) and 3' CREB 341
primer.

3' CREB 341 primer SEQ ID NO: 26

5'-CGCGGATCCTTAATCTGACTTGTGGCAGTA

After gel purification, the PCR product was digested with BamHI, and
subcloned into the BamHI site of pBTM116.

G. pLexA-CREB 341-M1

A DNA fragment containing the rat CREB sequence with a
mutation changing serine at position 133 to alanine was amplified by PCR
from plasmid Rc/RSV CREB-M1 [Kwok, et al., supra] using the same set of
primers as described for pLexA-CREB 341, 5' CREB 341 primer (SEQ ID
NO: 25) and 3' CREB 341 primer (SEQ ID NO: 26). The resulting
amplification product was gel-purified, digested with BamHI, and subcloned
into the BamHI site of pBTM116.

H. pVP16-CREB M1

A PCR fragment containing CREB sequences coding for amino
acid residues 1 to 283 including the serine 133 mutation to alanine was
amplified using a pair of oligonucleotides, 5' CREB 283 primer and 3' CREB
283 primer (SEQ ID NO: 28). The PCR fragment was gel-purified, digested
with BamHI and inserted into the BamHI site of pVP16.

5' CREB 283 primer SEQ ID NO: 27

5'-CGCGGATCCCCATGACCATGGAATCTGGAGCC

I. pLexA-SRF 

A DNA fragment containing human SRF was excised from
plasmid pCGN-SRF [Grueneberg, D.A., et al., Science 257:1089-1095
(1992)] with XhoI and BamHI digestion. The XhoI site of the fragment was
blunt-ended by the large fragment of DNA polymerase I (Gibco BRL, Grand
Island, NY), ligated with BamHI linkers, digested with BamHI, and inserted
into pBTM116 previously digested with BamHI.

WO 98/13502
PCT/US97/17276



J. pVP16-Tax 

A DNA sequence encoding full length Tax protein was excised
from pS6424 [Kwok, R.P.S., et al., Nature 380:642-646 (1996)] with BamHI
digestion and was inserted into pVP16 previously digested with BamHI.

IV. Plasmids For Binding Protein Controls

A. pLeu

Plasmid pVP16 was digested with HindIII and BamHI to
remove the fragment encoding the VP16 transactivation domain. The digested
vector was blunt-ended and self-ligated.

B. pLexA-VP16

The VP16 transactivation domain was PCR amplified from
pGal-VP16 [Sadowski, et al., Nature 335:563-564 (1988)] with a pair of
oligonucleotides, 5'-VP16SH and 3'-VP16SH, and the resulting amplification
product was digested with ClaI, blunt-ended, and inserted into pBTM116.

5'-VP16SH SEQ ID NO: 35

GGCTATCGATACGGCCCCCCCGACCGAT

3'-VP16SH SEQ ID NO: 36

GCGTATCGATCTACCCACCGTACTCGTC

C. pLexA-Lamin

See [Hollenberg, S.M. et al., Mol. Cell. Biol. 15:3813-3822
(1995)].
V. Plasmids Encoding Reporter Gene Controls

A. pRS306/Term

The alcohol dehydrogenase (ADH) terminator sequence was
excised from plasmid pBTM116 [Bartel, et al., in Cellular Interactions in
Development: a Practical Approach, (ed) Hartley, D.A. (IRL Press, Oxford),
pp. 153-179 (1993)] with SphI and PstI restriction enzymes and both 3'-
overhanging sequences were blunted by T4 DNA polymerase (Gibco BRL,
Grand Island, NY). The fragment was gel-purified and subcloned into the
blunt-ended NotI site in pRS306 [Sikorski and Hieter, Genetics 122:19-27
(1989)]. The orientation of the inserted fragment was determined by DNA
sequencing.

B. pRS316/Term

The subcloning protocol for inserting the ADH terminator
sequence into pRS316 was the same as described for inserting the ADH
sequence in pRS306.

Example 2 
Generation of Yeast Assay Transformant 

Selection of an appropriate yeast assay strain is an empirical
determination based on growth characteristics of the transformed alternatives.
A general method to make the appropriate selection is described as follows.

Candidate yeast assay strains were transformed individually with
reporter gene constructs and/or a plasmid encoding one of the experimental
binding proteins. Assay strains thus transformed were then compared for
relative differences in growth characteristics, with an optimal assay strain
showing negligible growth on media lacking histidine and vigorous growth on
media containing histidine. In practical application of this first step in selection
using various plasmids transformed into assay strain YI584, the following
results were observed.
When the plasmid pLexA-VP16 encoding both the LexA DNA
binding domain and the VP16 transactivating domain as a single protein was
introduced into the assay cells, growth in the absence of histidine in the media
was significantly reduced three days after transformation.

In assays including transformation with plasmids encoding
multiple copies of the tet operator upstream of the HIS3 gene, the following
plasmids were separately utilized:

pRS303/1xtetop-HIS (encoding a single tet operator sequence),
pRS303/2xtetop-HIS (encoding two tet operator sequences),
pRS303/3xtetop-HIS (encoding three tet operator sequences),
pRS303/4xtetop-HIS (encoding four tet operator sequences),
pRS303/6xtetop-HIS (encoding six tet operator sequences),
pRS303/8xtetop-HIS (encoding eight tet operator sequences), or
pRS303/10xtetop-HIS (encoding ten tet operator sequences).

In the assay strains transformed with plasmids encoding either one, two, or
three copies of the tet operator upstream from the HIS3 gene, cells grew on
media lacking histidine at a rate similar to cells grown on media containing
histidine. In yeast assay strains transformed with plasmids encoding either
six, eight, or ten copies of the tet operator upstream from the HIS3 gene, cell
growth was low, suggesting that these strains would not be useful in assays to
examine binding and interruption of binding between test proteins. These
results suggested that, in assay strains transformed with a reporter plasmid
having more than three tet operator sequences upstream from the HIS3 gene,
normal activity of the HIS3 promoter is disrupted and that these plasmids
would not be useful.

In assays wherein yeast cells were transformed with only
reporter plasmids (and not plasmids encoding binding partner fusion proteins)
encoding multiple copies of the LexA operator 5' of the TetR gene, the
following results were observed. Growth of assay cells transformed with
plasmids bearing one, two, four, and eight copies of the regulatory LexA
operator upstream of the TetR gene appeared to be "copy number" dependent.
Yeast cells transformed with plasmids having two copies of the LexA operator
grew at a rate significantly higher than those assay cells transformed with a
plasmid bearing only one copy of the operator. Cells transformed with
plasmids encoding either four or eight LexA operators upstream of the TetR
gene grew at an approximately equal rate, and better than assay cells bearing
a TetR gene driven by two copies of the operator.

When the alcohol dehydrogenase (ADH) promoter was included
upstream of the LexA operator (plasmids encoding either four or eight LexA
operators) in the various reporter gene constructs, cell viability was the
lowest.

The various cell lines constructed by the methods described
above are shown in Table 1, wherein various transformed yeast strains are
identified (Strain #) along with the number of LexA operator sequences in the
plasmid encoding TetR, the number of tetracycline operator sequences
regulating expression of HIS3, and relative growth rate of the transformed
strain on media containing histidine. It is important to note that growth
variation of transformed cells in media containing histidine is observed, even
in cell lines identically transformed. The number of "+" signs in Table 1 is
indicative of the host cell's relative ability to grow on media lacking histidine
in the absence of transformation with plasmids encoding potential binding
proteins. Also in Table 1, a subscript "a" is indicative of transformation with
a plasmid bearing the alcohol dehydrogenase promoter; absence of a subscript
"a" indicates use of the HIS3 promoter.
Table 1

Various Yeast Transformants

Diploids L40

Strain #   LexA    TetOp   Host strain   His+
YI579      1X      2X                    +++
YI602      ?X a    6X
YI581      1X      2X                    +++
YI607      4X      6X                    +++
YI628      4X      6X                    ++?
YI580      2X      2X                    +++
YI632      4X a    6X
YI582      2X      2X                    +++
YI605      ?X a    10X
YI610      4X      10X                   +
YI622      4X      10X                   +?
YI62?      4X ?    10X
YI583      4X      2X
YI585      4X      2X                    +++
YI592      8X      2X                    +++
YI587      4X      2X                    +++
YI59?      8X a    2X                    +++
YI589      4X      2X                    +++
YI59?      8X      4X                    +
YI584      8X      2X                    +++
YI635      8X a    4X                    +
YI586      8X      2X                    +++
YI637      8X      4X                    ++
YI588      8X      2X                    +++
YI601      8X      6X
YI???      8X      2X                    +++
YI608      8X a    6X                    +
YI629      ?X a    6X                    +++
YI631      8X      6X                    +++
YI604      8X      10X                   +
YI611      8X a    10X                   +
YI591      2X      2X                    +++
YI623      ?X a    10X                   ++
YI594      2X      2X                    +++
YI625      8X      10X                   ++
YI597      2X      4X
YI633      2X      4X                    +
YI636      2X      4X                    ++
YI664      ?X a    3X      w303 (50)     +++
YI600      2X      6X
YI666      ?X a    3X      w303 (51)     +++
YI606      2X      6X
YI630      2X      6X                    +
YI668      4X a    2X      L40 (69)      +++
YI627      2X      6X                    +++
YI670              2X      L40 (70)      +++
YI603      2X      10X                   +
YI665      8X a    3X      w303 (50)     +++
YI621      2X      10X                   ++
YI667      ?X a    3X      w303 (51)     +++
YI609      2X      10X                   +
YI67?      ?X a    3X      L40 (69)      +++
YI624      2X      10X                   ++
YI669      8X a    2X      L40 (69)      +++
YI593      4X a    2X                    +++
YI67?      8X a    2X      L40 (70)      +++
YI595      4X      2X                    +++
YI67?      ?X a    6X      L40 (69)      +++
YI599      4X a    4X
YI634      4X      4X
YI638      ?X a    4X

A question mark denotes a character that is not legible; an empty cell
denotes an entry for which no value is shown.
Example 3
CREB/CBP Binding Interaction

Use of the split-hybrid assay for studies of protein/protein
binding wherein one of the binding components is randomly mutagenized was
carried out using CREB and CBP binding proteins. The binding of CREB to
CBP has been shown to require the phosphorylation of the CREB serine
residue at position 133 in a region designated the "kinase-inducible domain"
(KID) [Chrivia, et al., Nature 365, 855-859 (1993); Kwok, et al., Nature
370, 223-226 (1994)]. Functionally, changing serine at position 133 to
alanine (a mutant designated CREB-M1) abolishes the ability of CBP to
activate CREB-mediated transcription. Preliminary studies have indicated that
the CREB-M1 mutant in the split-hybrid system prevents the interaction with
CBP and subsequent growth of the yeast assay strain on media lacking
histidine. Precisely what other portions of the KID of CREB are required for
binding to CBP is unknown, however. To define other potentially important
amino acid residues, the KID (amino acid residues 102 to 160) of CREB 341
was randomly mutagenized using PCR.

A. PCR Mutagenesis and Creation of Mutant Library

The technique used for mutagenic PCR was a modification of
that described by Uppaluri and Towle [Mol. Cell. Biol. 15, 1499-1512
(1995)]. The reaction mixture contained 20 ng of pVP16-CREB(BglII-SacII)-
LacZ, 16 mM (NH4)2SO4, 67 mM Tris-HCl, pH 8.8, 6.1 mM MgCl2, 0.5
mM MnCl2, 6.7 EDTA, 10 mM β-mercaptoethanol, 1 mM primers, 1 mM
each dGTP, dTTP, and dCTP, 400 µM dATP, and 2.5 units of Taq DNA
polymerase (Promega, Madison, WI). After seven cycles of PCR (94°C for
40 sec, 50°C for 40 sec, and 72°C for 40 sec), the PCR product was
amplified a second time using the same primers and Vent DNA polymerase
(New England BioLabs, Beverly, MA) under the same conditions for 25
cycles. The resultant PCR product was gel purified, digested with BglII and
SacII, and inserted into the BglII and SacII sites of pVP16-CREB(BglII-SacII)-
LacZ (construction of which is described above). The resulting plasmids were
transformed into DH5α bacterial cells. Transformants were pooled and
plasmid DNA was isolated by CsCl gradient centrifugation.

B. Construction and Use of pVP16-CREB(BglII-SacII)-LacZ

A DNA fragment encoding the β-galactosidase gene was fused in frame to
the carboxyl-terminal end of VP16-CREB as described above. The
carboxy-terminal tag allowed identification of clones that contain
frame-shift and nonsense mutations; colonies that remain positive for
β-galactosidase were presumed to contain an open reading frame
throughout the mutated region. To facilitate the subcloning of mutated
sequences, a cassette version of the CREB cDNA was generated that
contained BglII and SacII sites flanking the 5' and 3' ends of the KID,
respectively. These modifications altered the amino acid residue at
position 168 from valine to alanine. The cDNA altered in this manner was
indistinguishable from the original VP16-CREB and from VP16-CREB-LacZ
when tested in the split hybrid assay. Primers complementary to regions
flanking the KID were used in mutagenic PCR amplification reactions as
described above under conditions which were optimized to achieve one to
three mutations in the 177 bp region encoding the KID. PCR products were
introduced into pVP16-CREB(BglII-SacII)-LacZ in place of wild-type
sequence. A library of mutated sequences was transformed into yeast
assay strain YI584 expressing LexA-CBD. Approximately 27,000 yeast
transformants were screened, yielding about 5,000 colonies that were
capable of growing on selective media supplemented with 10 µg/ml of
tetracycline and 1 mM of 3AT, determined as described below.
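
As a rough check on the figures in this paragraph, the length of the
mutagenized region and the per-base mutation frequency implied by the
stated target can be worked out directly. The sketch below is
illustrative only: the function names are hypothetical, and the
frequency is simply the target of one to three mutations divided by the
region length, not a measured error rate.

```python
def region_length_bp(first_residue, last_residue):
    """Length in base pairs of the coding region for an inclusive residue range."""
    return (last_residue - first_residue + 1) * 3


def per_base_frequency(mutations_per_clone, region_bp):
    """Average mutations per base pair implied by a target number per clone."""
    return mutations_per_clone / region_bp


# KID residues 102 to 160, as stated in the text.
kid_bp = region_length_bp(102, 160)
low = per_base_frequency(1, kid_bp)
high = per_base_frequency(3, kid_bp)

print(f"KID coding region: {kid_bp} bp")
print(f"Implied frequency: {low:.4f} to {high:.4f} mutations per bp")
```

The residue range given for the KID corresponds to 59 codons, which
agrees with the 177 bp figure quoted above.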

Two screening steps were performed to eliminate uninformative mutations
and false positives. First, filter β-galactosidase assays were performed
by standard methods [Vojtek, et al., Cell 74:205-214 (1993)] on the
5,000 colonies which exhibited positive growth on media lacking
tryptophan, histidine, uracil, leucine, and lysine to eliminate
expressed proteins having frame-shift and nonsense mutations. Five
hundred thirty six colonies developed a dark blue color, whereas 412
colonies turned white and were presumed to express mutants containing
either frame-shift or nonsense mutations. The other colonies developed a
pale blue color, and control experiments suggested that these colonies
may have expressed unstable lacZ fusion proteins. Pale blue colonies
were not analyzed further.

DNA from the 536 dark blue colonies was isolated and transformed into
E. coli MC1066 cells. One hundred ninety three
pVP16-CREB(BglII-SacII)-LacZ cDNAs were then isolated.

In a second screening step, the 193 cDNAs were separately re-transformed
along with pLexA-CBD into the split-hybrid strain as well as into the
two-hybrid L40 strain [Vojtek, et al., supra] in order to identify false
positives and confirm that the mutant CREB proteins did not interact
with CBP. Among the 193 cDNAs re-screened, 152 did not interact with CBP
in the yeast two-hybrid system, 15 interacted weakly, and 26 interacted
like wild-type CREB.

Following these two screening steps, the 152 CREB mutants were
sequenced. Seventy CREB mutants were found to contain a single amino
acid change. Sixty four CREB mutants contained two amino acid residue
mutations and 13 mutants contained more than two amino acid mutations.
Mutants containing more than one amino acid alteration were not analyzed
further. The expression levels of mutant proteins having one amino acid
change were determined using a standard β-galactosidase assay.
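
The counts reported for the re-screen and for sequencing can be tallied
as a simple consistency check. The sketch below uses only the numbers
quoted above; the variable names are illustrative. The three re-screen
classes sum to the 193 cDNAs tested, whereas the three sequence classes
sum to 147, so the text leaves five of the 152 non-interacting mutants
without a stated class.

```python
# Counts quoted in the text for the CREB KID mutant screen.
rescreened = {"no interaction": 152, "weak interaction": 15, "wild-type-like": 26}
sequenced = {"single change": 70, "two changes": 64, "more than two": 13}

total_rescreened = sum(rescreened.values())
classified = sum(sequenced.values())
# Non-interacting mutants that the text does not assign to a sequence class.
unclassified = rescreened["no interaction"] - classified

print(f"cDNAs re-screened: {total_rescreened}")
print(f"Mutants assigned to a sequence class: {classified}")
print(f"Mutants without a stated class: {unclassified}")
```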

The CREB mutations identified in the split-hybrid screen were shown to
carry amino acid changes centered around the phosphorylation site at
serine at position 133. No disrupting mutations were found to contain
amino acid alterations outside of the region between amino acids 130 and
141. Most of the mutations abrogated the PKA phosphorylation region, but
others were identified at isoleucine at position 137, leucine at
position 138, and leucine at position 141. The mutations at positions
137, 138, and 141 generally changed the hydrophobic residues at these
positions to polar residues. The ability of the split-hybrid system to
detect only a limited number of CREB mutants, many of which have been
proposed previously to disrupt CREB association with CBP [Parker, et
al., Mol. Cell. Biol. 16, 694-703 (1996)], indicates the specificity of
the split-hybrid system.

These results lead to several suggestions. Various CREB mutations were
identified which disrupt CREB-CBP interaction, and the majority of
disrupting mutations occurred in the CREB PKA phosphorylation motif.
This result was consistent with previous observations that
nonphosphorylated CREB and CBP do not interact [Kwok, et al., Nature
370:223-226 (1994)]. The most common motif for PKA phosphorylation is an
RRX(S/T)X amino acid sequence, but RX(S/T)X and KRXX(S/T)X are also
phosphorylated [Kemp and Pearson, T.I.B.S. 15, 342-346 (1990)]. The
arginine residues in the phosphorylation site are critical for
electrostatic interactions with acidic amino acid residues in the
catalytic subunit of PKA [Knighton, et al., Science 253, 414-420
(1991)], and consistent with this observation, CREB mutants with changes
at arginine residues 130 and 131 were identified in the split hybrid
assay that did not interact with CBP.

Results also showed that CREB mutants with changes at proline 132 and
tyrosine 134 were unable to bind CBP. It is likely that the mutations at
these residues adversely affect the structure of the phosphorylation
motif, although these positions are generally thought to be less
critical to CBP binding. It is possible that the substitution of proline
at position 132 with threonine created a new phosphorylation site (RXTX)
that interfered with the critical phosphorylation of serine at position
133. Although not generally thought to be part of the "classical"
consensus PKA phosphorylation motif, hydrophobic amino acids are
commonly found carboxy-terminal to PKA sites [Kemp, et al., T.I.B.S.
19:440-444 (1994)]. The importance of these flanking residues may
explain the frequent occurrence of disrupting mutations involving
tyrosine at position 134. Further studies will be directed to
determining if mutations of proline at position 132 and tyrosine at
position 134 directly disrupt phosphorylation of serine at position 133
or disrupt binding of CREB to CBP by some other mechanism.

In addition, substitution of serine at position 133 with threonine also
prevented the interaction of CREB and CBP. PKA protein substrates
containing a phosphorylatable threonine residue are known to exist in
nature (e.g., protein phosphatase inhibitor 1 and myelin basic protein),
although they are less common than those with phosphorylatable serines
[Zetterqvist, et al., in Peptides and Protein Phosphorylation, (ed.)
Kemp, B.E. (CRC Press, Boca Raton, FL), pp. 172-187 (1990)], and
synthetic peptides containing serine to threonine substitutions are
relatively poor substrates for PKA phosphorylation [Zetterqvist, et al.,
supra]. In the split-hybrid assay, however, it is unclear whether the
threonine substitution at position 133 directly disrupts the CREB-CBP
interaction or whether the mutant fails to become phosphorylated.
Although the serine residue at position 133 of mammalian CREB can be
phosphorylated by a variety of protein kinases other than PKA, for
example calcium/calmodulin-dependent protein kinases II and IV, protein
kinase C, and a nerve growth factor (NGF)-activated CREB kinase [Sheng,
et al., Neuron 4:571-582 (1990); Sheng, et al., Science 252:1427-1430
(1991); Xie and Rothstein, J. Immunol. 154:1717-1723 (1995); Ginty, et
al., Cell 77:1-20 (1994)], it is not known which, if any, of these
particular protein kinases are able to phosphorylate CREB at the serine
at position 133 in yeast. The requirement for integrity of the entire
RRXSX amino acid sequence, however, suggests that PKA is a reasonable
candidate.
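
The motif reasoning in the preceding paragraphs can be illustrated with
a small pattern-matching sketch. It writes the three PKA motifs quoted
above as regular expressions and applies them to the five residues the
text places at positions 130 to 134 (arginine, arginine, proline,
serine, tyrosine). The function name and the peptide strings are
illustrative assumptions, and a sequence match says nothing about
whether a kinase actually phosphorylates the site.

```python
import re

# Motifs quoted in the text, with "." standing for any residue (X) and the
# phospho-acceptor serine/threonine captured as group 1.
PKA_MOTIFS = {
    "RRX(S/T)X": r"RR.([ST]).",
    "RX(S/T)X": r"R.([ST]).",
    "KRXX(S/T)X": r"KR..([ST]).",
}


def candidate_acceptors(peptide, first_residue_number):
    """Return sorted residue numbers of S/T residues lying in any listed motif."""
    hits = set()
    for pattern in PKA_MOTIFS.values():
        # A lookahead lets overlapping motif instances all be reported.
        for match in re.finditer("(?=" + pattern + ")", peptide):
            hits.add(first_residue_number + match.start(1))
    return sorted(hits)


# ASSUMPTION: one-letter codes for residues 130-134 as named in the text.
wild_type = "RRPSY"
p132t = "RRTSY"  # proline 132 replaced by threonine

print("wild type:", candidate_acceptors(wild_type, 130))
print("P132T:    ", candidate_acceptors(p132t, 130))
```

Under these assumptions the wild-type fragment yields a single candidate
acceptor at position 133, whereas the proline-to-threonine substitution
adds a second candidate at position 132, in line with the RXTX
suggestion made above.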

The second category of mutations was identified adjacent to the PKA
phosphorylation motif. Isoleucine at position 137 and leucine at
position 138 have previously been suggested to be important for
hydrophobic interactions of CREB with CBP [Parker, et al., Mol. Cell.
Biol. 16, 694-703 (1996)]. In this study, most of the mutations at
positions 137 and 138 converted these hydrophobic residues to polar
amino acids. Thus, another possibility for the failure of these mutants
to bind to CBP is that changes at these positions affect protein
folding. Similarly, the mutation at position 141 substituted a polar
residue for the wild-type hydrophobic leucine, and this mutation also
has the potential to affect protein folding.

Substitution of the isoleucine at position 137 with a hydrophobic
phenylalanine residue was found to disrupt the interaction between CREB
and CBP as well. This result could reflect a detrimental effect on
folding because of the steric hindrance associated with the
comparatively larger size of phenylalanine. Alternatively, the proposed
hydrophobic interactions between CREB and CBP may be somewhat specific.
Structural studies will be directed to determining definitively how
these mutations affect binding.

Perhaps most surprising was the finding that critical mutations were
restricted to a small region in the KID sequence, even though the
relatively low affinity of phosphorylated CREB and CBP, determined to be
between 250 and 400 nM by fluorescence anisotropy measurements [Kwok, et
al., Nature 370, 223-226 (1994)], is consistent with a restricted
protein binding domain. The capability of the split-hybrid system to
screen for a limited number of CREB mutants suggests that the system is
highly specific, and thus should be useful to identify mutations which
disrupt interactions between other pairs of binding proteins.

Example 4
Tax/SRF Binding Interaction 

To further investigate the feasibility of using the split-hybrid system
to study protein-protein interactions, a pair of well characterized
interacting proteins, SRF and Tax, was tested. Previous studies
indicated that SRF and Tax interact in a standard yeast two-hybrid
system, suggesting that the proteins may be utilized in the split hybrid
assay. Plasmid pLexA-SRF, containing a human SRF cDNA fused to the LexA
DNA binding domain, was transformed into strain YI584 along with either
pVP16-Tax or pVP16 alone. As with the pLexA-VP16 transformation, the
yeast strains co-expressing LexA-SRF and VP16-Tax failed to yield any
colonies on medium lacking histidine. In contrast, when LexA-SRF was
co-transformed with a vector encoding the VP16 activation domain alone,
yeast growth occurred on medium lacking histidine, suggesting that TetR
expression was not activated. These results demonstrated that a
protein-protein interaction in the split-hybrid system can effectively
prevent yeast growth and further indicated the utility of the assay for
the study of various protein/protein interactions.
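
The growth outcomes described in this example follow from a simple chain
of logic: an interaction drives TetR expression, active TetR represses
HIS3, and without HIS3 the cells cannot grow in the absence of
histidine. The sketch below is a deliberately simplified boolean model
of that chain. The function names are hypothetical, and the model
ignores dose effects of tetracycline and 3AT as well as leaky
expression.

```python
def tetr_expressed(proteins_interact):
    """TetR reporter is transcribed only when bait and prey interact."""
    return proteins_interact


def grows_without_histidine(proteins_interact, tetracycline=False):
    """Predict growth on medium lacking histidine in the split-hybrid strain."""
    # Tetracycline is modeled as fully relieving TetR-mediated repression.
    repressor_active = tetr_expressed(proteins_interact) and not tetracycline
    his3_expressed = not repressor_active
    return his3_expressed


# LexA-SRF with VP16-Tax: interaction, so no growth is predicted.
print("SRF + Tax:       ", grows_without_histidine(True))
# LexA-SRF with VP16 alone: no interaction, so growth is predicted.
print("SRF + VP16 alone:", grows_without_histidine(False))
```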

Example 5 
Casein Kinase Binding Assays 

Hrr25

In another example of use of the split hybrid assay to examine
protein/protein interactions, Hrr25, a yeast casein kinase isoform, or
human casein kinase I isoform δ, was employed in the assay with a known
binding partner protein.

Previous work using the two hybrid assay had identified three genes
encoding proteins which interact with the yeast casein kinase isoform
Hrr25. Proteins encoded by the genes were designated TIH1, TIH2, and
TIH3. The Hrr25 expression construct which was generated for use in the
two hybrid assay was used in combination with the individual TIH
encoding constructs in the split hybrid assay to determine if
interaction between the binding partners would decrease growth of assay
yeast cells on media lacking histidine. Construction of the Hrr25
expression plasmid and isolation of plasmids encoding TIH proteins are
discussed below.

In order to identify genes encoding proteins that interact with S.
cerevisiae HRR25 CKI protein kinase, a plasmid library encoding fusions
between the yeast GAL4 activation domain and S. cerevisiae genomic
fragments ("prey" components) was screened for interaction with a DNA
binding domain hybrid that contained the E. coli lexA gene fused to
HRR25 ("bait" component). The fusions were constructed in plasmid
pBTM116 which contains the yeast TRP1 gene, a 2µ origin of replication,
and a yeast ADH1 promoter driving expression of the E. coli lexA protein
containing a DNA binding domain (amino acids 1 to 202).

Plasmid pBTM116::HRR25 encoding the lexA::HRR25 fusion protein was
constructed in several steps. The DNA sequence encoding the initiating
methionine and second amino acid of HRR25 was changed to a SmaI
restriction site by site-directed mutagenesis using a MutaGene
mutagenesis kit from BioRad (Richmond, California). The DNA sequence of
HRR25 is set out in SEQ ID NO: 39. The oligonucleotide used for the
mutagenesis is set forth below, with the SmaI site (CCCGGG) set off by
spaces.

5'-CCTACTCTTAGG CCCGGG TCTTTTTAATGTATCC-3'    (SEQ ID NO: 37)

After digestion with SmaI, the resulting altered HRR25 gene was ligated
into plasmid pBTM116 at the SmaI site to create the lexA::HRR25 fusion
construct.

Interactions between bait and prey fusion proteins were detected in
yeast reporter strain CTY10-5d (genotype = MATa ade2 trp1-901 leu2-3,112
his3-200 gal4 gal80 URA3::lexAop-lacZ) [Luban, et al., Cell 73:1067-1078
(1993)] carrying a lexA binding site that directs transcription of lacZ.
Strain CTY10-5d was first transformed with plasmid pBTM116::HRR25 by
lithium acetate-mediated transformation [Ito, et al., J. Bacteriol.
153:163-168 (1983)]. The resulting transformants were then transformed
with a prey yeast genomic library prepared as GAL4 fusions in the
plasmid pGAD [Chien, et al., Proc. Natl. Acad. Sci. (USA) 88:9578-9582
(1991)] in order to screen the expressed proteins from the library for
interaction with HRR25. A total of 500,000 double transformants were
assayed for β-galactosidase expression by replica plating onto
nitrocellulose filters, lysing the replicated colonies by
quick-freezing the filters in liquid nitrogen, and incubating the lysed
colonies with the blue chromogenic substrate
5-bromo-4-chloro-3-indolyl-β-D-galactoside (X-gal). β-galactosidase
activity was measured using Z buffer (0.06 M Na2HPO4, 0.04 M NaH2PO4,
0.01 M KCl, 0.001 M MgSO4, 0.05 M β-mercaptoethanol) containing X-gal at
a concentration of 0.002% [Guarente, Meth. Enzymol. 101:181-191 (1983)].
Reactions were terminated by floating the filters on 1 M Na2CO3 and
positive colonies were identified by their dark blue color.
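
For readers preparing the Z buffer described above, the salt masses
follow from molarity times molecular weight times volume. The sketch
below is a convenience calculation only. It assumes anhydrous salts
(hydrated forms, which are commonly used, require different molecular
weights), the function name is hypothetical, and β-mercaptoethanol, a
liquid, is left out.

```python
# Molar concentrations quoted in the text (mol/L).
Z_BUFFER_MOLARITY = {
    "Na2HPO4": 0.06,
    "NaH2PO4": 0.04,
    "KCl": 0.01,
    "MgSO4": 0.001,
}

# ASSUMPTION: molecular weights (g/mol) of the anhydrous salts.
MOLECULAR_WEIGHT = {
    "Na2HPO4": 141.96,
    "NaH2PO4": 119.98,
    "KCl": 74.55,
    "MgSO4": 120.37,
}


def grams_needed(volume_liters):
    """Mass in grams of each salt for the requested buffer volume."""
    return {
        salt: molarity * MOLECULAR_WEIGHT[salt] * volume_liters
        for salt, molarity in Z_BUFFER_MOLARITY.items()
    }


for salt, grams in grams_needed(1.0).items():
    print(f"{salt}: {grams:.3f} g per liter")
```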

Library fusion plasmids (prey constructs) that conferred blue color to
the reporter strain co-dependent upon the presence of the HRR25/DNA
binding domain fusion protein partner (bait construct) were identified.
The sequence adjacent to the fusion site in each library plasmid was
determined by extending DNA sequence from the GAL4 region. The
sequencing primer utilized is set forth below.

5'-GGAATCACTACAGGGATG-3'    (SEQ ID NO: 38)

DNA sequence was obtained using a Sequenase version II kit (US
Biochemicals, Cleveland, Ohio) or by automated DNA sequencing with an
ABI373A sequencer (Applied Biosystems, Foster City, California).

Four library clones were identified and the proteins they encoded are
designated herein as TIH proteins 1 through 4 for Targets Interacting
with HRR25-like protein kinase isoforms. The TIH1 portion of the TIH1
clone insert corresponds to nucleotides 1528 to 2580 of SEQ ID NO: 40;
the TIH2 portion of the TIH2 clone insert corresponds to nucleotides
2611 to 4053 of SEQ ID NO: 41; and the TIH3 portion of the TIH3 clone
insert corresponds to nucleotides 248 to 696 of SEQ ID NO: 42. Based on
DNA sequence analysis of the TIH genes, it was determined that TIH1 and
TIH3 were novel sequences that were not representative of any protein
motif present in the GenBank database (July 8, 1993). TIH2 sequences
were identified in the database as similar to a yeast open reading frame
having no identified function (GenBank Accession No. Z23261, open
reading frame YBL0506).

When the various TIH proteins were used in the split hybrid assay in
combination with Hrr25, it was observed that Hrr25/TIH3 binding,
previously determined to be weaker than Hrr25/TIH2 or Hrr25/TIH1
interactions, produced the lowest level of growth in the transformed
yeast strain.

CKIδ

In order to isolate cDNAs which encode proteins that interact with CKIδ,
the two hybrid assay was performed using a LexA-CKIδ fusion protein as
bait. The coding region of CKIδ was subcloned into a BamHI site of
pBTM116 and transformed into a yeast strain designated CKIδ/L40 (MATa
his3Δ200 trp1-901 leu2-3,112 ade2 LYS::(lexAop)4-HIS3
URA3::(lexAop)8-lacZ GAL4). CKIδ/L40 was subjected to a large scale
transformation with a cDNA library made from mouse embryos staged at
days 9.5 and 10.5. Approximately 40 million transformants were obtained.
Eighty-eight million were plated onto selective media lacking leucine,
tryptophan and histidine. The ability of yeast transformants to grow in
the absence of histidine suggested that there was an interaction between
CKIδ and some library protein.

In a second screening, interaction of the two proteins was assayed by
the ability of the interaction to activate transcription of
β-galactosidase. Colonies that turned blue in the presence of X-gal were
streaked onto media lacking leucine, tryptophan and histidine, grown up
in liquid culture and pooled for isolation of total DNA. Isolated DNA
was used to transform E. coli strain 600 which lacks the ability to grow
on media lacking leucine. Colonies that grew were used for plasmid
preparation and three classes of cDNA were identified. One class was
closely related to a Drosophila transcription factor dCREBa.

When CKIδ/CREB interaction was examined in the split hybrid assay, cells
were shown to grow on media containing histidine, but in the absence of
histidine, growth was inhibited. Addition of small amounts of
tetracycline to the cell culture restored the cells' ability to grow,
suggesting that the interaction between CKIδ and CREBa was very weak.

Example 6 

AKAP 79 Binding Assays

Expression Plasmid Utilized 

In still another example of use of the split hybrid assay to examine
protein/protein interactions, an anchoring protein for the cAMP
dependent protein kinase, AKAP 79, was utilized separately with binding
partner proteins including the cAMP dependent protein kinase regulatory
subunit type I (RI), the cAMP dependent protein kinase regulatory
subunit type II (RII) or calcineurin (CaN). Plasmids used in the assay
were constructed as described below.

A 1.3 kb NcoI/BamHI fragment containing the coding region of AKAP 79 was
isolated from a pET11d backbone and ligated into plasmid pAS1. Plasmid
pAS1 is a 2 micron based plasmid with an ADH promoter linked to the Gal4
DNA binding subunit [amino acids 1-147 as described in Keegan et al.,
Science, 231:699-704 (1986)], followed by a hemagglutinin (HA) tag,
polycloning site and an ADH terminator. The expressed protein was
therefore a fusion between AKAP 79 and the DNA binding domain of Gal4.

Plasmids encoding RI, RII or CaN were isolated from a pACT murine T cell
library in a standard two hybrid assay using the AKAP 79 expression
construct described above. Plasmid pACT is a leu2, 2 micron based
plasmid containing an ADH promoter and terminator with the Gal4
transcription activation domain II [amino acids 768-881 as described in
Ma and Ptashne, Cell, 48:847-853 (1987)], followed by a multiple cloning
site. RI, RII and CaN encoding plasmids were isolated as described
below.

A 500 ml SC-Trp yeast cell culture (OD600 = 0.6-0.8) was harvested,
washed with 100 ml distilled water, and repelleted. The pellet was
brought up in 50 ml LiSORB (100 mM lithium acetate, 10 mM Tris pH 8, 1
mM EDTA pH 8, and 1 M sorbitol), transferred to a 1 liter flask and
shaken at 220 rpm during an incubation of 30 minutes at 30°C. The cells
were pelleted, resuspended in 625 µl LiSORB, and held on ice while
preparing the DNA.

The DNA was prepared for transformation by boiling 400 µl of 10 mg/ml
salmon sperm DNA for 10 minutes, after which 500 µl LiSORB was added and
the solution allowed to slowly cool to room temperature. DNA from a Mu T
cell library was added (40-50 µg) from a 1 mg/ml stock. The iced yeast
cell culture was dispensed into 10 Eppendorf tubes with 120 µl of
prepared DNA. The tubes were incubated at 30°C with shaking at 220 rpm.
After 30 minutes, 900 µl of 40% PEG 3350 in 100 mM Li acetate, 10 mM
Tris, pH 8, and 1 mM EDTA, pH 8, was mixed with each culture and
incubation continued for an additional 30 minutes. The samples were
pooled and a small aliquot (5 µl) was removed to test for transformation
efficiency and plated on SC-Leu-Trp plates. The remainder of the cells
were added to 100 ml SC-Leu-Trp-His media and grown for one hour at 30°C
with shaking at 220 rpm. Harvested cells were resuspended in 5.5 ml
SC-Leu-Trp-His media containing 50 mM 3AT (3-amino triazole) and 300 µl
aliquots plated on 150 mm SC-Leu-Trp-His plates also containing 50 mM
3AT. Cells were left to grow for one week at 30°C.

After four days, titer plates were counted and 1.1 x 10^5 colonies were
screened. Large scale β-gal assays were performed on library plates and
ten positive clones were isolated for single colonies. One of these
colonies grew substantially larger than the rest, and was termed clone
11.1. Sequence from clone 11.1 revealed an open reading frame 487 aa
long which was correctly fused to the Gal4 activation domain of pACT.
The NIH sequence database was searched and the sequence was found to be
closely homologous to the human calmodulin dependent protein
phosphatase, calcineurin.

Additional screening using pACT Mu T-cell library DNA and the pAS1 AKAP
79 bait strain was performed in order to identify other AKAP 79 binding
proteins by the protocol described above. Results from screening
approximately 211,000 colonies gave one positive clone designated pACT
2-1. Sequencing and a subsequent data base search indicated that the
clone had 91% identity with rat type Iα regulatory subunit of protein
kinase A (RI).

The library was rescreened using the same AKAP 79 bait and fifteen
positives were detected from approximately 520,000 transformants. Of
these fifteen, eleven were found to be homologous to the rat regulatory
subunit type I of PKA. Each of these isolates was fused to the 5'
untranslated region of RI and remained open through the initiating
methionine.

Split Hybrid Analysis

In split hybrid analysis of AKAP 79 binding interactions, a plasmid was
first constructed for expression of a LexA:AKAP 79 fusion protein. An
AKAP 79 coding region was excised from pAS1 AKAP 79 as an NcoI/BamHI
fragment and inserted into pBTM116 previously digested with the same
enzymes. The resulting plasmid was designated pBTM116-AKAP79.

Approximately 50,000 W303 yeast cells (strain YI665, see Table 1) in
logarithmic growth were rinsed in media lacking histidine, suspended in
100 µl to 200 µl of the same media, and plated on agar lacking histidine
(to select for absence of protein/protein interaction) and also lacking
leucine and tryptophan (to select for transformants bearing expression
constructs encoding AKAP 79 and its binding partner). When RII was
employed as the AKAP 79 binding partner, 2 to 4 µM tetracycline and 5 mM
3AT were required to prevent the transformed host from growing under
conditions where the expressed proteins interacted.

Once conditions were established under which growth of the transformed
host was eliminated, various candidate inhibitor compounds were
separately added to the agar. It was presumed that if one of the
candidate compounds was capable of disrupting AKAP 79 interaction with
the binding partner protein, growth of the transformed host should be
detectable in the vicinity of the compound on the agar. In the split
hybrid assay wherein AKAP 79 and RII binding was examined, 2 µl of a 30
mM stock solution of ICOS Compound 4273 in DMSO, 2 µl of a 10 mM stock
solution of ICOS Compound 1062 in DMSO, and 2 µl DMSO alone (as a
negative control) were spotted onto the plate, which was incubated at
30°C for four to five days. For ICOS Compound 4273 a ring of growth was
detected.
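
The amount of each compound delivered to the plate follows from the
spotted volume and the stock concentration, since microliters multiplied
by millimolar gives nanomoles. The short sketch below works this out for
the three spots described above; the function and variable names are
illustrative only.

```python
def nmol_spotted(volume_ul, stock_mM):
    """Amount delivered in nanomoles (microliters x millimolar = nanomoles)."""
    return volume_ul * stock_mM


# Volume (µl) and stock concentration (mM) as given in the text.
spots = {
    "ICOS Compound 4273": (2, 30),
    "ICOS Compound 1062": (2, 10),
    "DMSO alone": (2, 0),
}

for name, (volume_ul, stock_mM) in spots.items():
    print(f"{name}: {nmol_spotted(volume_ul, stock_mM)} nmol")
```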

In order to determine an IC50 for an inhibitor identified as described
above, alternative methods may be used. In one method, the inhibitor
compound is added to the agar over a range of concentrations. Ideally,
the compound is diluted to the point that host cell growth is
essentially not detectable.

In another method, a 96 well plate is used and the compounds of interest
are serially diluted across the rows of the plate, one compound per row.
Media lacking histidine, tryptophan, and leucine is added (presuming
that the expression plasmids encoding the binding partners also encode
trp and leu proteins) along with the appropriately transformed host
yeast strain. Tetracycline and 3AT are added at concentrations
previously determined to extinguish growth of the transformed host cell.
After two to five days incubation at 30°C, the plate wells are read at
approximately 600 nm using a plate reader. The concentration of
inhibitor half way between zero and the lowest concentration that
permits growth of the host cell to the level observed on media
containing histidine is estimated to be the IC50.
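
The estimate described in this paragraph can be sketched in a few lines.
The code below builds one row of a serial dilution and then applies the
stated rule: find the lowest concentration at which growth reaches the
level seen on histidine-containing media and take half of it. The
two-fold dilution factor, the 90% cut-off used to decide that growth has
"reached" the reference level, the function names, and the example
readings are all assumptions made for illustration.

```python
def serial_dilution(top_concentration, n_wells, factor=2.0):
    """Concentrations across one row, highest first (factor is an assumption)."""
    return [top_concentration / factor ** i for i in range(n_wells)]


def estimate_ic50(concentrations, od600, reference_od600, fraction=0.9):
    """Half of the lowest concentration restoring growth to the reference level.

    `fraction` is an assumed tolerance for calling growth equal to the
    reference observed on histidine-containing media.
    """
    restoring = [
        conc
        for conc, reading in zip(concentrations, od600)
        if reading >= fraction * reference_od600
    ]
    if not restoring:
        return None  # no tested concentration restored growth
    return min(restoring) / 2.0


# Hypothetical readings for one compound row.
row = serial_dilution(100.0, 6)
readings = [1.02, 0.99, 0.95, 0.40, 0.08, 0.05]
print(estimate_ic50(row, readings, reference_od600=1.0))
```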

A modification of this second method is particularly amenable for use in
a high throughput screen of large numbers of candidate inhibitors. For
example, rather than attempting to determine the IC50 for a previously
identified inhibitor, separate candidate inhibitors are added to each
well of a 96 well plate, preferably at more than one concentration, and
host cell growth determined after several days incubation. Inhibitory
activity of compounds identified in this manner is confirmed on an agar
plate and the IC50 determined on 96 well plates, each assay as described
above.

Example 7 

General Application of The Split-Hybrid Screen

In order to examine general utility of the split hybrid system, various
experiments were conducted with binding proteins known to interact. In
addition, a number of control experiments were included in order to
determine if the effects observed with the known binding partners were
in fact due to protein/protein interaction.

A. Yeast Assay Strain Construction

Yeast transformants used in assays indicated below were derived from
LYS2-deficient strains AMR69 (Mat a his3 lys2 leu2 trp1,
URA3:LexA::LacZ) and AMR70 (Mat α his3 lys2 trp1 leu2, URA3:LexA::LacZ)
[Hollenberg, et al., Mol. Cell. Biol. 15, 3813-3822 (1995); Chien, et
al., Proc. Natl. Acad. Sci. (USA) 88:9578-9582 (1991); Fields and Song,
Nature 340:245-246 (1989)]. Yeast were grown in YEPD or selective
minimal medium using standard conditions [Sherman, F., et al., Methods
in Yeast Genetics, Cold Spring Harbor Lab., Cold Spring Harbor, NY
(1986); Methods in Enzymology, Vol. 194, Guide to Yeast Genetics and
Molecular Biology, Eds. Guthrie and Fink]. Derivatives of both AMR69 and
AMR70 strains lacking URA3 were first generated by streaking cells on
synthetic media containing 5 mg/ml 5-fluoro-orotic acid (5FOA) [Methods
in Enzymology, Vol. 194, Guide to Yeast Genetics and Molecular Biology,
Eds. Guthrie and Fink]. Two URA3-deficient mutants were required because
these strains were subsequently mated. URA3-deficient colonies were
confirmed by testing for uracil auxotrophy, and deletion of the
URA3:LexA::LacZ locus was confirmed by an absence of β-galactosidase
activity assayed by standard methods. The mutant strains selected were
designated 69-4 and 70-1.

Targeted integration of pRS306/8xLexAop/TetR was carried out by
transforming [Hollenberg, et al., Mol. Cell. Biol. 15, 3813-3822 (1995)]
the 69-4 strain with plasmid linearized at a unique NcoI site. The
reporter gene construct was constructed using parental plasmid pRS306,
which encodes URA3 as a selectable marker. Stably integrated plasmid
thereby permitted selection on media lacking uracil. The positive uracil
prototrophic strains were examined by Southern analysis to confirm
insertion of the plasmid sequences.

Targeted integration of pRS303/2xtetop-LYS was carried out by
transformation [Hollenberg, et al., supra] of strain 70-1 with plasmid
linearized at a unique HpaI site. The resulting lysine prototrophic
strains were examined by Southern analysis to confirm insertion of the
plasmid DNA.

The AMR69 derivative strain (MAT a) containing the 
pRS303/2xtetop-LYS insertion was mated with the AMR70-derivative strain 
25 (MAT a) containing pRS306/8xLexAop/TetR and mated cells were selected 
on media lacking both lysine and uracil. Single colonies were grown up and 
tested for the ability to grow on media lacking histidine. The resulting strain 
was designated YI584. In instances where yeast strains were transformed with 
other reporter gene pair combinations, the strains were uniquely designated. 

Yeast bearing integrated reporter gene constructs were 
subsequently transformed [Hollenberg, et al., supra] with plasmids encoding 
chimeric binding proteins. Plasmids encoding the LexA DNA binding region 
were generally derived from parental plasmid pBTM116, which also encodes 
TRP1 as a selectable marker. Plasmids encoding the VP16 transactivating 
domain were generally derived from parental plasmid pVP16, which also 
encodes LEU2 as a selectable marker. Yeast cells which were successfully 
transformed with the four exogenous plasmids were therefore selected by an 
ability to grow on media lacking lysine, uracil, tryptophan, and leucine. 
Plasmids encoding various binding proteins were transformed into the yeast 
assay strain as indicated below. 
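
The selection logic described above can be sketched in a few lines of Python. This is an illustrative sketch only: the names `MARKER_TO_NUTRIENT`, `PLASMID_MARKERS`, and `dropout_media` are hypothetical and are not part of the described method. The marker assigned to each plasmid follows the text, with the lysine marker written as LYS2 on the assumption that it matches the LYS2 locus named later in the description.

```python
# Illustrative sketch: each selectable marker restores synthesis of one
# nutrient, so omitting that nutrient selects for the plasmid carrying it.
# All names here are hypothetical helpers, not part of the source method.
MARKER_TO_NUTRIENT = {
    "URA3": "uracil",
    "LYS2": "lysine",      # assumed to be the marker on pRS303/2xtetop-LYS
    "TRP1": "tryptophan",
    "LEU2": "leucine",
}

PLASMID_MARKERS = {
    "pRS306/8xLexAop/TetR": "URA3",
    "pRS303/2xtetop-LYS": "LYS2",
    "pBTM116 derivative": "TRP1",
    "pVP16 derivative": "LEU2",
}


def dropout_media(plasmids):
    """Return the sorted nutrients to omit so that only cells carrying
    every listed plasmid can grow."""
    return sorted(MARKER_TO_NUTRIENT[PLASMID_MARKERS[p]] for p in plasmids)


print(dropout_media(PLASMID_MARKERS))
```

Passing only the two reporter plasmids returns lysine and uracil, which corresponds to the selection used for the mated reporter strain.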

B. Liquid Assay 

After three days growth at 30°C on selection media as 
described above, a pool of colonies from each transformation was collected 
and diluted in 5 ml selective media. The mixture was vortexed and 
immediately sonicated for ten seconds. Cells in the resulting suspension were 
counted and seeded at 1000 cells/ml in selective media, 2 ml per 15 ml tube. 
Tetracycline, 3AT, and histidine were included as determined appropriate by 
the method described above. Each aliquot of cells was incubated with shaking 
for two days at 30°C and cell density measured at OD600. 
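
The seeding step is a simple dilution (C1 x V1 = C2 x V2). The following Python sketch shows the arithmetic; the function name `seeding_volume_ml` and the example cell count are hypothetical and chosen only for illustration, while the target density (1000 cells/ml) and culture volume (2 ml) are taken from the text.

```python
def seeding_volume_ml(counted_cells_per_ml, target_cells_per_ml=1000.0,
                      final_volume_ml=2.0):
    """Volume of counted suspension needed so that the final culture has
    the target density, from C1 * V1 = C2 * V2."""
    if counted_cells_per_ml < target_cells_per_ml:
        raise ValueError("suspension is already more dilute than the target")
    return target_cells_per_ml * final_volume_ml / counted_cells_per_ml


# Hypothetical count of 2e6 cells/ml in the sonicated suspension.
volume = seeding_volume_ml(2.0e6)
print(f"add {volume * 1000:.1f} microliters of suspension per 2 ml culture")
```

For counts this high the required volume is too small to pipette accurately, so in practice an intermediate dilution would likely be made first; the same function can be applied to each dilution step.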

C. Characterization of the Assay 

The utility of the split-hybrid assay was first determined using 
well characterized binding proteins and various controls. 

In an initial study, YI584 cells were transformed with plasmids 
pLexA-VP16 and pLeu. While the expressed proteins from the two plasmids 
do not interact, pLexA-VP16 encodes a fusion protein containing the VP16 
activation domain fused directly to LexA, which contains a DNA binding 
domain. The chimeric LexA-VP16 protein is a strong transactivator for a 
promoter containing LexA operators. Plasmid pLeu is essentially a blank used 
as a control co-transformation plasmid. 

Yeast transformed with the LexA-VP16 plasmid were able to 
express TetR protein, as indicated by gel shift analysis using a tet operator 
oligonucleotide. In addition, the cells were unable to grow on media in the 
absence of histidine. Combined, these observations suggested that 
overexpressed TetR protein was capable of binding to tet operators and 
preventing the expression of HIS3. The transformed yeast grew on plates 
containing histidine, further indicating that overexpression of TetR did not 
have a toxic effect on the assay cells. 

The results were consistent with previous observations and 
supported the earlier suggestion that activation of TetR expression, either 
through a single transcription factor or association of individual transcription 
factor domains, is capable of preventing assay cell growth on media lacking 
histidine, presumably by eliminating HIS3 production. 
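
The reporter logic summarized above can be written as a small truth-table model. This is a deliberately simplified, all-or-none sketch: the function names are hypothetical, tetracycline is assumed to block TetR completely, and graded effects (weak interactions, partial repression, 3AT) are ignored.

```python
def tetr_expressed(single_activator=False, proteins_interact=False):
    """TetR is transcribed when an activation domain is brought to the LexA
    operators, either by one fusion protein (e.g. LexA-VP16) or by two
    interacting hybrid proteins."""
    return single_activator or proteins_interact


def grows_without_histidine(single_activator=False, proteins_interact=False,
                            tetracycline=False):
    """HIS3 is repressed when free TetR binds the tet operators.
    Simplifying assumption: tetracycline fully prevents TetR binding."""
    repressor_active = (tetr_expressed(single_activator, proteins_interact)
                        and not tetracycline)
    return not repressor_active


# LexA-VP16 control: TetR is made, HIS3 is off, no growth without histidine.
print(grows_without_histidine(single_activator=True))
# A disrupted interaction: no TetR, HIS3 is on, cells grow.
print(grows_without_histidine(proteins_interact=False))
```

The model makes the selection principle explicit: growth on media lacking histidine reports the absence of a functional activator at the LexA operators.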

Example 8 

Split-Hybrid Assay With Weakly Interacting Binding Proteins 

Protein/protein interaction was examined in the split-hybrid 
assay to determine utility of the system using two fusion proteins known to 
interact weakly. In this instance, the binding proteins were a 283 amino acid 
fragment of a cAMP response element binding protein (CREB283) fused to LexA 
and a fragment of the CREB binding protein consisting of the CREB binding 
domain (CBD) fused to VP16. 

In this assay, yeast strain YI584 described above was employed 
and transformation carried out as previously described. In a first assay, 
plasmids pLexA-CREB and pVP16-CBD were transformed into the cells and 
cell growth was observed in the absence of histidine in the media. Expression 
of the fusion proteins was confirmed by Western blotting. Attempts to 
decrease cell growth by titration with 3AT were unsuccessful in that the 
concentration of 3AT required to reduce growth in cells transformed with 
pLexA-CREB and pVP16-CBD also eliminated growth in cells transformed 
with pLexA-CREB and the control plasmid pVP16. 

In light of these results, two alternative approaches were taken 
to permit study of binding proteins whose interaction is relatively 
weak. Under the assumption that the system was failing at the level 
of TetR transcription, both approaches attempted to amplify 
the TetR effect on expression of the HIS3 gene. In the first approach, assay 
cells were transformed with reporter constructs which encoded multiple tet 
operator sequences upstream from the HIS3 gene. In the second approach, the 
HIS3 promoter used to drive expression of the TetR gene was replaced with the 
stronger alcohol dehydrogenase (ADH) promoter. 

In YI596 cells wherein the ADH promoter replaced the HIS3 
promoter to drive TetR expression, transformation with plasmids pLexA-CREB 
and pVP16-CBD showed substantially decreased growth on his- media 
as compared to that in assay strain YI592 wherein the HIS3 promoter was 
used to drive TetR expression. However, in cells transformed with plasmids 
pLexA-CREB 341-M1 and pVP16-CBD, no decrease in assay cell growth was 
detected on media lacking histidine. These results indicate that incorporation 
of the ADH promoter to drive TetR expression may be more useful in studies 
involving binding proteins that have low affinity. 

When assay strains were utilized which incorporated plasmids 
wherein expression of the HIS3 gene was driven by multiple copies of the tet 
operator, transformed cell lines did not grow well enough to indicate potential 
utility in subsequent assays. 

Example 9 
General Assay Methods 

A. "Fine Tuning" 

In instances where either of the test fusion proteins possesses 
intrinsic capacity for transcriptional activation, TetR will be expressed and 
growth of the assay strain on media lacking histidine will be depressed 
in proportion to the level of TetR expression. In order to restore growth of 
these cells to approximately the level observed on media containing histidine, 
the initially transformed assay yeast strains are grown in the presence of 
increasing concentrations of tetracycline, which binds to the TetR gene product 
and prevents TetR binding to the tet operator. Precise titration of expressed 
TetR with tetracycline, only to the point that growth of the assay strain is 
restored to the level detected in the presence of histidine, permits detection of 
subsequent decreased growth of the assay strain following increased TetR 
expression resulting from interaction of the test binding proteins. The 
empirically determined tetracycline concentration is therefore employed to 
increase "signal-to-noise" ratios under assay conditions. 

After an appropriate tetracycline concentration has been 
determined for each of the candidate assay strains, the cells are transformed 
with the second plasmid encoding the second fusion binding protein. As 
before, growth of each candidate assay strain is examined on media in the 
presence and absence of histidine. A desirable yeast assay strain is chosen 
which shows vigorous growth in the presence of histidine and negligible 
growth on media lacking histidine (indicative of the expected protein/protein 
interaction and resultant decreased expression of HIS3). 

In instances where binding between the two test proteins is 
comparatively weak, TetR expression may not be sufficiently increased to 
abolish HIS3 expression and cells expressing the resultant low levels of HIS3 
will still grow on media which lacks histidine. Cells which show this low 
level of viability are grown in the presence of increasing concentrations of 
3-aminotriazole (3AT), a competitive inhibitor in the histidine synthesis 
pathway, in order to reduce cell growth to negligible levels when plated on 
media lacking histidine. As with titration of TetR with tetracycline, addition 
of 3AT to the media is designed to increase the signal-to-noise ratio by 
providing significant changes in growth in the presence and absence of 
histidine in the media. 

In a practical application of the methods for fine tuning, binding 
between CREB and the CREB binding protein (CBP) is illustrative. Growth 
of the yeast strain YI584 transformed with pLexA-CBD, encoding the CREB 
binding domain (CBD) of CBP, and pVP16-CREB, or with pLexA-CBD and the 
control plasmid pVP16, was substantially decreased, and virtually 
indistinguishable growth rates were detected in both instances on media 
lacking histidine. This observation indicated that the LexA-CBD protein 
product possessed sufficient transactivating capacity to eliminate HIS3 
production. In order to distinguish growth differences between assay cells 
transformed with either pVP16 or pVP16-CREB, increasing amounts of 
tetracycline were added to the media lacking histidine. 

In both transformants, tetracycline was able to relieve growth 
repression in a dose dependent manner, and at increasing concentrations of 
tetracycline, the difference in growth between the two colonies was 
increasingly magnified, with the most distinct growth difference observed 
following addition of tetracycline at 10 µg/ml. Addition of tetracycline was 
therefore able to overcome the intrinsic transactivating capability of the 
LexA-CBD fusion protein. 

Because the ultimate use of the split-hybrid system is for 
structure-function studies, mutagenesis studies, drug identification and library 
screens, it is important to minimize background growth that might be confused 
with disrupted protein-protein associations. This can be accomplished by the 
addition of 3AT, a competitive inhibitor of the HIS3 gene product. For 
instance, in the presence of 10 µg/ml of tetracycline, the yeast strain 
transformed with pLexA-CBD and pVP16-CREB still grew to approximately 
12% of the level observed on his+ media. To diminish this 
background, increasing concentrations of 3AT were added to the media in the 
presence of 10 µg/ml of tetracycline. At the 3AT concentration of 0.25 mM, 
the growth of the yeast strain expressing LexA-CBD and VP16-CREB was 
below 5%, while the growth of the control strain was still maintained at 70% 
of control levels. These results indicate that the split-hybrid system can be 
modulated by 3AT in addition to tetracycline in order to effectively increase 
the signal-to-noise ratio. 
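
One way to compare titration conditions is to express the signal-to-noise ratio as the fold difference between control-strain and test-strain growth on media lacking histidine. The Python sketch below is an assumption about how such a comparison might be scored, not a procedure stated in the text; the function names are hypothetical. The only data point used is the one reported above, and because test-strain growth was reported only as "below 5%", the value 5.0 is an upper bound and the resulting ratio is a lower bound.

```python
def selection_window(control_growth_pct, test_growth_pct):
    """Fold difference between control-strain and test-strain growth
    (both as percent of growth on histidine-containing media)."""
    if test_growth_pct <= 0:
        raise ValueError("test growth must be positive to form a ratio")
    return control_growth_pct / test_growth_pct


def best_condition(conditions):
    """Pick the (label, control_pct, test_pct) tuple with the largest window."""
    return max(conditions, key=lambda c: selection_window(c[1], c[2]))


# Reported point: 10 ug/ml tetracycline plus 0.25 mM 3AT.
# Test growth was "below 5%", so 5.0 is used as an upper bound.
reported = ("tet 10 ug/ml + 3AT 0.25 mM", 70.0, 5.0)
print(selection_window(reported[1], reported[2]))
```

Further titration points, once measured, could be appended to a list and passed to `best_condition` to select the tetracycline and 3AT concentrations giving the widest window.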

B. Preparation of yeast extracts 

In order to assess the utility of various plasmids to function in 
the split-hybrid assay, a number of control experiments can be employed 
which lend insight into expression of a desired protein from the transformed 
plasmid. For example, standard immunological methodologies, e.g., 
immunoprecipitation, ELISA, etc., can be used to determine the extent to 
which a desired protein is expressed. Similarly, a variation of the gel shift 
assay (discussed immediately hereafter) can be used to determine both if a 
protein is expressed and if the expressed protein is capable of DNA binding. 
In each of these control assays, a yeast extract is required which can be 
prepared as follows. 

Extracts were prepared as described by Uppaluri and Towle 
[Mol. Cell. Biol. 15:1499-1512 (1995)] and were used for electrophoretic 
mobility shift assays as discussed below. The yeast cells transformed with 
pLexA-VP16 were grown in 100 ml of selective synthetic medium lacking 
uracil, tryptophan, and lysine to a density of A600 = 1. Cells were harvested 
and washed with 5 ml of EB (containing 0.2 M Tris-HCl, pH 8.0, 400 mM 
(NH4)2SO4, 10 mM MgCl2, 1 mM EDTA, 10% glycerol, and 7 mM 
β-mercaptoethanol). Cells were transferred to microcentrifuge tubes and 
collected by centrifugation. After resuspending in 200 µl EB containing 1 
mM phenylmethylsulfonyl fluoride (PMSF), 1 µg/ml leupeptin, and 1 µg/ml 
pepstatin, a one-half volume of glass beads was added. The suspension was 
frozen in a -80°C freezer for 1 hour and thawed on ice. Thawed cells were 
vortexed at 4°C for 20 minutes, after which an additional 100 µl EB was 
added, and cells were left on ice for 30 minutes. The suspension was 
centrifuged for 5 minutes, and the supernatant was transferred to a new tube 
which was centrifuged for 1 hour in a microcentrifuge. The supernatant was 
then made to 40% with (NH4)2SO4 and gently rocked for 30 minutes. After a 10 
minute centrifugation, the pellet was resuspended in 300 µl of 10 mM 
HEPES, pH 8.0, 5 mM EDTA, 7 mM β-mercaptoethanol, 1 mM PMSF, 1 
µg/ml leupeptin, 1 µg/ml pepstatin, and 20% glycerol. The resulting 
suspension was dialyzed against the same buffer, and aliquots were stored at 
-80°C. 
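
The EB composition given above can be converted into pipetting volumes with the usual C1 x V1 = C2 x V2 relation. The Python sketch below is illustrative only: the final concentrations are those stated in the text, but every stock concentration in `STOCKS` is an assumption made for the example (including a value of about 14.3 M for undiluted β-mercaptoethanol), and the function name is hypothetical.

```python
# Final EB composition from the text (M, except glycerol in % v/v).
EB_FINAL = {
    "Tris-HCl pH 8.0": 0.2,
    "(NH4)2SO4": 0.4,
    "MgCl2": 0.010,
    "EDTA": 0.001,
    "glycerol": 10.0,
    "beta-mercaptoethanol": 0.007,
}

# ASSUMED stock concentrations in the same units; not from the source.
STOCKS = {
    "Tris-HCl pH 8.0": 1.0,
    "(NH4)2SO4": 2.0,
    "MgCl2": 1.0,
    "EDTA": 0.5,
    "glycerol": 50.0,
    "beta-mercaptoethanol": 14.3,
}


def stock_volumes_ml(final, stocks, total_ml):
    """Volume of each stock (ml) for total_ml of buffer, plus water to volume."""
    volumes = {name: final[name] / stocks[name] * total_ml for name in final}
    water = total_ml - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute to reach the final volume")
    volumes["water"] = water
    return volumes


for name, ml in stock_volumes_ml(EB_FINAL, STOCKS, 100.0).items():
    print(f"{name}: {ml:.3f} ml")
```

With different stock solutions on hand, only the `STOCKS` dictionary needs to change.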



C. Electrophoretic mobility shift assays 

Electrophoretic mobility shift assays were performed as described by 
Shih and Towle [J. Biol. Chem. 267:13222-13228 (1992)]. Double-stranded 
tet operator oligonucleotides were prepared by combining equivalent amounts 
of complementary single-stranded DNA (SEQ ID NOS: 7 and 8) in a solution 
containing 50 mM Tris-HCl, pH 8.0, 10 mM MgCl2, and 50 mM NaCl, 
heating the mixture to 70°C for 10 minutes, and then cooling to room 
temperature. The annealed oligonucleotides were labeled by filling in 
overhanging 5' ends using the Klenow fragment of E. coli DNA polymerase 
I with [α-32P]dCTP. Binding reactions were carried out in 20 µl containing 
10 mM Tris-HCl, pH 7.5, 50 mM NaCl, 1 mM EDTA, 1 mM dithiothreitol, 
5% glycerol, and 2 µg of poly[d(I-C)]. A typical reaction contained 20,000 
cpm (0.5-1 ng) of end-labeled DNA with 3-5 µg of yeast extract. Following 
incubation at 22°C for 30 minutes, samples were separated on a 4.5% 
nondenaturing polyacrylamide gel containing 50 mM Tris, 384 mM glycine, 
and 2 mM EDTA, pH 8.3. For competition binding experiments, the 
conditions were exactly as above except that specific and nonspecific 
competitor DNAs were included in the binding mixture before the yeast 
extract was added. The concentration of tetracycline, a competitive inhibitor 
of TetR/tet operator binding, was 1 µM when utilized. 



Example 10 

Application of the Split-Hybrid Assay to Identify Agents 
That Prevent Receptor Desensitization and Drug Tachyphylaxis 

Over half of the drugs that are used clinically affect the function 
of seven transmembrane receptors. Although many of the characteristics of 
these receptors are distinct, two general features appear to be conserved. One 
is the ability to signal through dissociation of heterotrimeric G proteins. The 
second is the capacity to lose responsiveness to ligand binding in a process 
termed desensitization, which is mediated by receptor phosphorylation and the 
subsequent binding of factors that recognize the phosphorylated state of the 
receptor and prevent continued signaling. Desensitization results in an 
intrinsic limitation to drug action imposed by the action of the drug itself, i.e., 
activation of a receptor by a hormone or drug initiates mechanisms that 
prevent subsequent responses to repeated administration of the same agent. 
The coupled mechanisms of activation and deactivation together have been 
termed "homologous desensitization," while the inability of a drug to maintain 
its efficacy is known as "tachyphylaxis." Even though the mechanisms 
underlying homologous desensitization have been worked out in great detail 
over the past few years, there are currently no useful pharmacological 
approaches available that prevent the inactivation mechanism. 

The potential clinical utility of agents that could prevent or 
modulate drug desensitization is enormous. Four examples where therapy is 
limited by the inability of receptors to maintain responsiveness to drugs 
include: (i) asthma, wherein desensitization of airway adrenergic receptors 
renders epinephrine treatment ineffective after a period of hours; (ii) 
congestive heart failure, wherein desensitization of adrenergic and VIP 
receptors, coupled with an elevation of the β adrenergic receptor kinase 
(βARK), prevents the inotropic effects of endogenous regulatory hormones; 
(iii) Parkinson's disease, wherein dopamine receptor desensitization limits the 
usefulness of agents like L-Dopa; and (iv) chronic pain, wherein tolerance 
results from opiate receptor desensitization. Indeed, it is difficult to conceive 
of a pharmacological modality in use today that is not limited in its 
effectiveness by the phenomenon of desensitization. 

The biochemical basis for G protein-coupled receptor 
desensitization involves three classes of proteins including arrestins, kinases 
and G-proteins, all of which have been cloned [Lefkowitz, Nature Biotechnology 
14:283-286 (1996)]. Following activation of a seven transmembrane receptor, 
a region is phosphorylated by one or more G protein-coupled receptor kinases 
(known as GRKs 1-6). For example, in the β-adrenergic receptor (βAR) and 
rhodopsin, the cytoplasmic tail is phosphorylated [Premont, et al., J. Biol. 
Chem. 269:6832-6841 (1994); Freedman, et al., J. Biol. Chem. 270:17953- 
17961 (1995); Palczewski, et al., J. Biol. Chem. 266:12949-12955 (1991); 
Palczewski, et al., J. Biol. Chem. 270:15294-15298 (1995)] while in the m2 
muscarinic receptor, the third cytoplasmic loop is phosphorylated [Nakata, et 
al., Eur. J. Biochem. 220:29-36 (1994)]. The best characterized members of 
the family of G protein receptor kinases are the βAR kinase (βARK) and 
rhodopsin kinase, which are both membrane-associated. While rhodopsin 
kinase contains an intrinsic membrane targeting signal [Inglese, et al., Nature 
359:147-150 (1992)], βARK appears to be targeted to the membrane by 
association with G protein βγ subunits [Pitcher, et al., Science 257:1264-1267 
(1992); Inglese, et al., Nature 359:147-150 (1992)]. Once the substrate 
receptor for each kinase is activated, presumably by ligand binding, the kinase 
associates and phosphorylates serine and threonine residues on the receptor. 
The phosphorylated receptor then becomes a binding target for one or more 
other proteins. In the case of βAR, for example, phosphorylation allows 
binding of arrestin, which prevents association with G proteins and promotes 
receptor sequestration and desensitization. Using the βAR as an exemplary 
desensitization model, it becomes apparent that multiple steps in the pathway 
provide potential points of regulation, each of which is amenable to 
the split-hybrid screen to identify molecules that can block the overall 
desensitization pathway. Specifically in the case of βAR, the split hybrid 
system can be used to identify small molecules that: (i) prevent interaction 
between βARK and the G protein β subunit; (ii) inhibit βARK activity; and 
(iii) disrupt the βARK:arrestin complex. 

A. Plasmid Constructions 

The study of G-protein receptor kinases in the split-hybrid 
system involves three or more recombinant proteins or two or more 
recombinant proteins and a recombinant peptide library. In the split-hybrid 
system discussed above, two yeast primary expression plasmids are employed: 
pBTM116 [Bartel et al., Cellular Interactions in Development: a Practical 
Approach, (ed) Hartley, IRL Press, Oxford, pp. 153-179 (1993)], which 
encodes the LexA-fusion protein and the TRP1 selectable marker, and pVP16 
[Hollenberg et al., Mol. Cell. Biol., 15:3813-3822 (1995)], which encodes the 
VP16-fusion protein and the LEU2 selectable marker. In order to study 
interactions involving more than two recombinant proteins in the split-hybrid 
system, however, additional selectable markers are employed. Construction 
of additional yeast expression plasmids which are used to examine interactions 
between more than two binding proteins is discussed below. 

1. Plasmid pDRM 

A DNA fragment comprising the ADH promoter and LexA 
sites, the TetR encoding gene, the nuclear localization signal, and the ADH 
terminator sequence is removed from pRS306/4xLexAop/ADH::TetR with 
SacI, blunt-ended, and digested with SalI. The fragment is isolated and 
ligated into pRS303/2xtetop-LYS2 which has previously been digested with 
NotI, blunt-ended, and digested with SalI. The resulting plasmid, designated 
pDRM, is integrated into the LYS2 locus in the yeast genome as described 
above, and the resulting strain designated YIDRM. Placing the repressor gene 
and selectable marker reporter gene in the LYS2 locus allows URA3 to be used 
as a selectable marker. 

2. Plasmid pRSURA3 

A modified version of the pRS306 vector [Sikorski et al., 
Genetics, 122:19-27 (1989)] containing the URA3 selectable marker gene is 
also used to encode additional recombinant proteins in the split-hybrid system. 
The plasmid, pRS426, has the 2 micron origin of replication inserted into a 
unique AatII site of pRS306. Plasmid pRS426 is further modified in the 
following manner: 

(i) The ADH promoter sequence is amplified by PCR from 
pBTM116 using primers which incorporate into the amplification product the 
DNA sequence encoding the SV40 large T antigen nuclear localization signal 
(NLS) and an initiating ATG sequence 3' to the ADH promoter. The ADH 
promoter/NLS/ATG sequence is inserted into the polylinker of pRS426. 

(ii) The ADH terminator sequence is amplified by PCR from 
pBTM116 using primers which incorporate into the product a DNA sequence 
encoding an antibody tag, for example, FLAG, hemagglutinin protein (HA), 
or thioredoxin (Thio) (FLAG, HA, and Thio antibodies are available through 
Santa Cruz Biotechnology, Santa Cruz, CA) and DNA sequences encoding 
stop codons in all three frames to the 5' end of the ADH terminator sequence. 
The antibody tag/stop codon/ADH terminator sequence is inserted into the 
polylinker of pRS426. 



3. Plasmid pRSADE2 

PCR is used to engineer unique restriction sites, including for 
example, BglII, Eco47III, MM, NheI, and SphI, immediately adjacent the 5' 
and 3' ends of the URA3 cassette in pRSURA3. The URA3 cassette is 
digested from pRSURA3 and replaced with the ADE2 cassette which is 
amplified by PCR. 

4. Plasmid pBTM116/AD4 

A fragment containing the ADH promoter, polylinker, and 
ADH terminator is digested from pAD4 [Young et al., Proc. Nat'l. Acad. Sci. 
(USA), 86:7989-7993 (1989)] with BamHI, blunt-ended and inserted into the 
blunt-ended PvuI site of pBTM116 as described [Keegan et al., Oncogene, 
12:1537-1544 (1996)], and the resulting vector designated pBTM116/AD4. 
PCR is also used to engineer a nuclear localization signal 3' of the ADH 
promoter as described above. This vector contains the TRP1 selectable 
marker and can encode two recombinant proteins: (i) a LexA-fusion protein 
and (ii) a protein expressed from the pAD4 region of the vector. 



B. βARK and G Protein β Subunit Binding 

In a first application of the split hybrid assay, disruption of 
binding between the carboxy-terminal domain of βARK, containing the 
pleckstrin homology (PH) domain, and the G protein β subunit (Gβ) is 
examined. Previous work indicates that the PH domain of βARK interacts 
directly with the βγ subunits of G proteins [Pitcher, J.A., et al., Science 
257:1264-1267 (1992) and Touhara, K., et al., J. Biol. Chem. 269:10217-10220 
(1994)]. Consistent with this observation is work by Pumiglia, et al. 
[Pumiglia, K.M., et al., J. Biol. Chem. 270:14251-14254 (1995)] which 
indicates that Gβ2 interacts with Raf1 in yeast and that the interaction is 
disrupted by βARK in vitro. 

A DNA fragment containing the carboxy-terminal 222 amino 
acids (residues 467 to 689) of βARK1, which includes the PH domain, is 
amplified by PCR from bovine βARK1 [Pitcher et al., Science, 257:1264- 
1267 (1992)] and the gel-purified amplification product is inserted into 
pBTM116. The resulting plasmid is designated LexA-COOH-βARK. A DNA 
fragment containing the entire coding sequence of Gβ2 [Fong et al., Proc. 
Nat'l. Acad. Sci. (USA), 84:3792-3796 (1987)] is PCR amplified from 
pGEM-11Zf(-)Gβ2 [Iñiguez-Lluhi et al., JBC, 267:23409-23417 (1992)] and the 
gel-purified amplification product inserted into pVP16. The resulting plasmid 
is designated pVP16-Gβ2. PCR is used in a similar manner to clone the 
carboxy-terminal domain of βARK into pVP16 and Gβ2 into pBTM116. 

βARK and Gβ2 binding is first examined in the two-hybrid 
system to determine if expression of either binding partner as a fusion protein 
in yeast affects protein/protein interaction. Binding of the two proteins is then 
examined in the split hybrid assay in order to determine if protein/protein 
interaction is capable of abolishing growth of the assay yeast strain. As 
above, the amount of tetracycline and/or 3-aminotriazole required to maximize 
the difference in growth in the presence and absence of histidine is empirically 
determined. 

Split-hybrid yeast strains containing βARK and Gβ2 subunits 
are used to screen libraries of small molecules. Several types of small 
molecule libraries can be examined in the split-hybrid assay, including for 
example, chemical libraries, libraries of products naturally produced by 
microorganisms, animals, plants and/or marine organisms, combinatorial, 
recombinatorial, peptidomimetic, multiparallel synthetic collection, protein, 
peptide and polypeptide libraries. A library of small peptides can be cloned 
into pRSURA3 as described [Yang et al., Nuc. Acids Res., 23:1152-1156 
(1995) and Colas et al., Nature, 380:548-550]. Peptides corresponding to the 
carboxy-terminus of βARK or other GRKs which have previously been shown 
to block calcium channel desensitization in intact neurons, presumably by 
blocking βARK and Gβ2 binding and subsequent trafficking of βARK to the 
cellular membrane [Diverse-Pierluissi, et al., Neuron 16:579-585 (1996)] can 
be identified in such a screen. Further, it is important to show that the 
molecules identified through the split hybrid selection affect βARK:Gβ 
interaction as opposed to, for example, tetracycline analogues identified in the 
screen that would not be useful to specifically modulate βARK/Gβ2 binding. 



B. Identification of βARK Inhibitors 

In a second approach, agents that directly inhibit βARK 
function are identified in a modification of the split-hybrid system. While 
identification of specific βARK inhibitors may be difficult, preliminary data 
from split hybrid assays using CREB/CBP binding partners indicate that the 
system can be used to identify serine kinase inhibitors. The serine kinase 
results also suggest several approaches can be employed in attempts to 
overcome potential problems in identifying βARK inhibitors. 

Briefly, binding between the phosphorylated G-protein coupled 
receptor (P-GR) and arrestin is examined first in the standard two hybrid 
assay, followed by identification of inhibitors of P-GR/arrestin binding in the 
split hybrid assay. For these studies, fragments of three G protein-coupled 
receptors are examined: the carboxy-terminal tail of β2AR and the third 
cytoplasmic loop of the m2 muscarinic receptor. A DNA fragment containing 
the carboxy-terminal tail of the β2AR (amino acids 330 to 413) is PCR 
amplified [Kobilka et al., JBC, 262:7321-7327 (1987)] and the gel purified 
product inserted into pBTM116/AD4 to produce a LexA-β2AR fusion gene. 
The resulting plasmid is designated pBTM-β2AR/AD4. A DNA fragment 
containing the third cytoplasmic loop of the human m2 muscarinic receptor 
(nucleotides 268-324) is amplified from pGEX-I3m2 [Haga et al., JBC, 
269:12594-12599 (1994)] by PCR and cloned into pBTM116/AD4 creating a 
LexA-m2 fusion gene. The resulting plasmid is designated pBTM-m2/AD4. 
The entire bovine βARK1 coding sequence [Benovic et al., Science, 246:235- 
240 (1989)] is PCR amplified and cloned into the polylinker region originating 
from AD4 in pBTM-β2AR/AD4 and pBTM-m2/AD4. The resulting plasmids 
are designated pBTM-β2AR/AD4-βARK and pBTM-m2/AD4-βARK, 
respectively. PCR is used to amplify the DNA fragment containing bovine 
βarrestin-1 (amino acids 1 to 437) [Lohse, et al., Science, 248:1547-1550 
(1990)]. This fragment is inserted into pVP16 and is designated 
pVP16-βarrestin-1. PCR is used to amplify the DNA fragment containing rat 
βarrestin-2 (amino acids 1 to 428) [Attramadal, et al., JBC, 267:17882- 
17890 (1992)] which is inserted into pVP16 to give plasmid 
pVP16-βarrestin-2. A PCR strategy is also used to clone arrestin into the 
pBTM116/AD4-βARK plasmid and the βAR and m2 fragments into pVP16. As 
above, the yeast split-hybrid YIDRM strain is transformed with the 
P-GR-arrestin along with peptide libraries (cloned into pRSURA3) or grown 
following transformation in the presence of combinatorial drug libraries. 
Inhibitors identified in the split hybrid assay should effect 
disruption of protein/protein interaction either by: (i) inhibiting βARK 
phosphorylation of the receptor, thus preventing recognition of the receptor 
by arrestin, or (ii) physical disruption of binding between the receptor and 
arrestin. Agents that allow yeast growth for trivial reasons, e.g., tetracycline 
analogues, can be easily identified through use of simple controls. 

A first potential problem to overcome in this study is that 
cytoplasmic βARK enzyme must be targeted to the substrate receptor and, 
once targeted, must phosphorylate the receptor at appropriate sites. In normal 
cells, βγ association serves to target βARK to the cell membrane; the β 
subunit binds to both the βARK PH domain and the isoprenylated γ subunit 
in association with the membrane. One possible means to encourage the 
necessary specific interactions is to target the binding components in the 
assay, i.e., βARK, the receptor cytoplasmic tail, and arrestin, to the nucleus 
by tagging the proteins with nuclear localization signals. The plasmids 
proposed for the study of the P-GR-arrestin interaction all contain nuclear 
localization signal sequences adjacent to recombinant gene sequence. 

A second problem is somewhat more difficult to approach. The 
current model is that receptors must be activated by ligand binding before 
being phosphorylated by βARK, i.e., targeting of βARK via βγ is not 
sufficient for receptor phosphorylation. There are two possible explanations 
for this requirement. The first is that phosphorylation sites on the receptor are 
masked in the absence of ligand and ligand binding causes a conformational 
change which "unmasks" the phosphorylation sites. If this is the case, a 
fragment of the receptor containing the immediate phosphorylation site may 
be used as the βARK target. However, although peptides representing 
portions of the βAR cytoplasmic tail can be phosphorylated by βARK, the Km 
for the phosphorylation reaction is poor, suggesting that the kinase may 
require some other part of the receptor for binding and that the unmasking of 
this binding site by agonist is a critical step. 

This problem is addressed in two ways. In the first, the m2 
muscarinic receptor is used in place of the βAR in view of previous results 
which indicate that the m2 protein is a good substrate for βARK. The third 
cytoplasmic loop of the m2 receptor serves as both the binding site and 
phosphorylation site for the kinase, which should allow use of a LexA/m2 
receptor third cytoplasmic loop fusion gene as one component in the screening 
system. 

An alternative approach is to artificially mimic the activated
state of the receptor. Haga, et al. [J. Biol. Chem. 269:12594-12599 (1994)]
have shown that the activity of βARK can be stimulated in vitro in the
presence of mastoparan, a wasp venom peptide. Mastoparan is believed to
mimic the cytoplasmic face of an activated receptor and has been shown to
increase the affinity of βARK for a GST-m2 receptor fusion protein by over
four orders of magnitude. The same effect can be seen by using peptides
representing the flanking regions of the m2 third cytoplasmic loop. Thus,
mastoparan should also activate βARK in the two-hybrid yeast strains, allow
phosphorylation of the receptor fusion protein, and promote interaction with
arrestin. If mastoparan is needed, oligonucleotides containing the coding and
non-coding nucleotide sequences of the 14-mer peptide (INLKALAALAKKIL-NH2,
SEQ ID NO: 43) are annealed and ligated into pRSADE2. The yeast
split-hybrid strain YIDRM is transformed with pBTM-βAR (or m2)/AD4-βARK,
pVP16-arrestin, pRSADE2-mastoparan, and a pRSURA3-peptide
library or combinatorial drug library.
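
By way of illustration only, the design of a complementary oligonucleotide
pair encoding the mastoparan 14-mer can be sketched in Python. The codon
assignments below are an arbitrary choice made for this sketch and are not
taken from the specification, which does not disclose the oligonucleotide
sequences actually used. Flanking cloning overhangs and start or stop
codons are likewise omitted, and all function names are hypothetical.

```python
# Illustrative sketch only: one possible pair of complementary
# oligonucleotides encoding the mastoparan 14-mer peptide.

MASTOPARAN = "INLKALAALAKKIL"

# One codon per residue from the standard genetic code.
# ASSUMPTION: this particular codon selection is hypothetical.
CODON_CHOICE = {"I": "ATT", "N": "AAT", "L": "TTG", "K": "AAA", "A": "GCT"}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_translate(peptide: str) -> str:
    """Return a coding-strand DNA sequence (5'->3') for the peptide."""
    return "".join(CODON_CHOICE[residue] for residue in peptide)


def reverse_complement(dna: str) -> str:
    """Return the complementary (non-coding) strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def translate(dna: str) -> str:
    """Translate back using only the codons chosen above (sanity check)."""
    decode = {codon: residue for residue, codon in CODON_CHOICE.items()}
    return "".join(decode[dna[i:i + 3]] for i in range(0, len(dna), 3))


coding_oligo = reverse_translate(MASTOPARAN)
noncoding_oligo = reverse_complement(coding_oligo)

if __name__ == "__main__":
    print("coding     5'-" + coding_oligo + "-3'")
    print("non-coding 5'-" + noncoding_oligo + "-3'")
```

Translating the coding strand back to protein and comparing it with the
peptide is a quick guard against transcription errors before the two
oligonucleotides are ordered and annealed.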

Numerous modifications and variations in the invention as set 
forth in the above illustrative examples are expected to occur to those skilled 
in the art. Consequently only such limitations as appear in the appended 
claims should be placed on the invention. 
SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: Hoekstra, Merl F. 

(ii) TITLE OF INVENTION: Methods to Identify Compounds For 
Disrupting Protein/Protein Interactions 

(iii) NUMBER OF SEQUENCES: 43 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Marshall, O'Toole, Gerstein, Murray & Bo- 

(B) STREET: 6300 Sears Tower, 233 South Wacker Drive 

(C) CITY: Chicago 

(D) STATE: Illinois 

(E) COUNTRY: United States of America 

(F) ZIP: 60606-6402 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 
(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: 

(B) REGISTRATION NUMBER: 

(C) REFERENCE/DOCKET NUMBER: 27866/33424 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 312/474-6300 

(B) TELEFAX: 312/474-0448 

(C) TELEX: 25-3856 

(2) INFORMATION FOR SEQ ID NO:1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 
TTGGTGAGCG CTAGGAGTCA CTGCCAG 
(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE : DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

TATACTCTAT CAATGATAGA GTAATTCATT ATGTGATAAT GCC 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 42 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
ATTACTCTAT CATTGATAGA GTATATAAAG TAATGTGATT TC 
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 
AATTCTGCTA GCCTCTGCAA AGC 
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
CGCACGCGTC GAAGAAATCA CATTACTTTA TATA 
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: 
CGCACGCGTA TACTAAAAAA TGAGCAGGCA AG 
(2) INFORMATION FOR SEQ ID NO: 7 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
CGCGTACTCT ATCATTGATA GAGTA 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
ATGAGATAGT AACTATCTCA TGCGC 
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
CGCGTACTCT ATCATTGATA GAGTCTAGAC TCTATCAATG ATAGAGTA 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE : DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
GCGACGCGTG CATGCCGTCT TCAAGAATTC CTCGAG 
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE : DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
GCGACGCGTG CATGCCCACC GTACACGCCT ACTCGA 
(2) INFORMATION FOR SEQ ID NO:12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
CATGGCATGC AAAAAAAAAG AGTCATCCGC TAGG 
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
CATGGCATGC TTAGCGATTG GCATTATCAC AT 
(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
TAATACGACT CACTATATAG GG 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
TCTAGACTTT GCCTTCGTTT ATC 23 
(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE : DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

CGAAGGCAAA GATGTCTAGA TTAGATAAAA G 31 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 49 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
CGCGGATCCG CTTTCTCTTC TTTTTTGGAG ACCCACTTTC ACATTTAAG 49 
(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18: 
AATTGCTCGA GTACTGTATG TACATACAGT AG 32 
(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
AATTCTACTG TATGTACATA CAGTACTCGA GC 32 
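
As an illustrative check that is not part of the original listing, SEQ ID
NO: 18 and SEQ ID NO: 19 can be compared in Python to see how the two
oligonucleotides relate to one another. The sketch assumes only the two
sequences as printed above; the helper names are hypothetical. It tests
whether the portions following the first four bases are reverse complements
of each other, in which case annealing the pair would leave a four-base 5'
overhang on each strand.

```python
# Illustrative sketch: do two oligonucleotides anneal into a duplex that
# leaves a short single-stranded 5' overhang on each strand?

SEQ18 = "AATTGCTCGAGTACTGTATGTACATACAGTAG"  # SEQ ID NO: 18 as listed
SEQ19 = "AATTCTACTGTATGTACATACAGTACTCGAGC"  # SEQ ID NO: 19 as listed

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def anneals_with_overhangs(top: str, bottom: str, overhang: int) -> bool:
    """True if the strands pair everywhere except the first `overhang`
    bases at the 5' end of each strand."""
    return reverse_complement(top[overhang:]) == bottom[overhang:]


if __name__ == "__main__":
    print(anneals_with_overhangs(SEQ18, SEQ19, 4))
    print(SEQ18[:4], SEQ19[:4])  # the unpaired 5' ends of each strand
```

The same helper can be applied to any other pair of listed oligonucleotides
by changing the sequences and the assumed overhang length.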
(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

CCGGAATTCT CGAGACATAT CCATATCTAA TC 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 27 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21: 
CCGGAATTCA CTAATCGCAT TATCATC 
(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 
CATGCCATGG CCATGTCTAG ATTAGATAAA AG 
(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: 
GCGAATTCGC CAGGGCAACA GAATGCCACT 
(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 24: 
CGGGATCCTG GCTGGTTACC CAGGATGCCT TG 
(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE : DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

CGCGGATCCG GATGACCATG GACTCTGGAG 
30 

(2) INFORMATION FOR SEQ ID NO:26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: 

CGCGGATCCT TAATCTGACT TGTGGCAGTA 
30 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27: 

CGCGGATCCC GATGACCATG GAATCTGGAG CC 
32 

(2) INFORMATION FOR SEQ ID NO:28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

CGCGGATCCG TGCTGCTTCT TCAGCAGGCT G 
31 

(2) INFORMATION FOR SEQ ID NO:29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29 
ATGGTACCAG CGGCCGCTAG TCGTTTTACA ACGTCGTGAC 
(2) INFORMATION FOR SEQ ID NO:30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30 

ATGGTACCGC GGCCGCTTAT TTTTGACACC AGACCAAC 
38 

(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

CGGAGATCTA AAGAGACTTT TCTCCGGAAC TCAG 
34 

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

CGGAGATCTT TACAGGAAGA CTGAACTGT 
29 

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33 

CCACCGCGGC AGTGCCAACC CCGATTTAC 
29 

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34 

CATCCGCGGT GGTGATGGCA GGGGCTGA 
28 

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 35 

GGCTATCGAT ACGGCCCCCC CGACCGAT 
28 

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36 

GCGTATCGAT CTACCCACCG TACTCGTC 
28 

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 34 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37 

CCTACTCTTA GGCCCGGGTC TTTTTAATGT ATCC 
34 

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE : DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38 

GGAATCACTA CAGGGATG 
18 

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1485 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

A 1 GG ACTTAA GAGTAGGAAG GAAATTTCGT ATTGGCAGGA AGATTGGGAG TGGTTCCTTT 60 

GGTGACATTT ACCACGGCAC GAACTTAATT AGTGGTGAAG AAGTAGCCAT CAAGCTGGAA 120 

TCGATCAGGT CCAGACATCC TCAATTGGAC TATGAGTCCC GCGTCTACAG ATACTTAAGC 180 

GGTGGTGTGG GAATCCCGTT CATCAGATGG TTTGGCAGAG AGGGTGAATA TAATGCTATG 240 

UTuATCGATC TTCTAGGCCC ATCTTTGGAA GATTTATTCA ACTACTGTCA CAGAAGGTTC 300 

T C CTTTAAG A CGGTTATCAT GCTGGCTTTG CAAATGTTTT GCCGTATTCA GTATATACAT 360 

GGAAGGTCGT TCATTCATAG AGATATCAAA CCAGACAACT TTTTAATGGG GGTAGGACGC 420 

CGTGGTAGCA CCGTTCATGT TATTGATTTC GGTCTATCAA AGAAATACCG AGATTTCAAC 480 

AUiLAJ. CG 1 L- ATATTCCTTA CAGGGAGAAC AAGTCCTTGA CAGGTACAGC TCGTTATGCA 540 

ALj lb 1 LAATA CGCATC1 rGG AATAGAGCAA AGTAGAAGAG ATGACTTAGA ATCACTAGGT 600 

TATGTCTTGA TCTATTTTTG TAAGGGTTCT TTGCCATGGC AGGGTTTGAA AGCAACCACC 660 

AAGAAACAAA AGTATGATCG TATCATGGAA AAGAAATTAA ACGTTAGCGT GGAAACTCTA 720 

TGTTCAGGTT TACCATTAGA GTTTCAAGAA TATATGGCTT ACTGTAAGAA TTTGAAATTC 780 

Vj A 1 LxACj AAu L CAGATTATTT GTTCTTGGCA AGGCTGTTTA AAGATCTGAG TATTAAACTA 840 

V3/iO±.HI LHLH ALuALUALl 1 GTTCGATTGG ACAATGTTGC GTTACACAAA GGCGATGGTG 900 

                      CATCGAAAAA GGTGATTTGA ACGCAAATAG CAATGCAGCA 960 

*Vj 1 uUAAu 1 A Av- AGCJACAGA CAACAAGTCT GAAACTTTCA ACAAGATTAA ACTGTTAGCC 1020 

ATGAAGAAAT TCCCCACCCA TTTCCACTAT TACAAGAATG AAGACAAACA TAATCCTTCA 1080 

CCAGAAGAGA TCAAACAACA AACTATCTTG AATAATAATG CAGCCTCTTC TTTACCAGAG 1140 

GAATTATTGA ACGCACTAGA TAAAGGTATG GAAAACTTGA GACAACAGCA GCCGCAGCAG 1200 

CAGGTCCAAA GTTCGCAGCC ACAACCACAG CCCCAACAGC TACAGCAGCA ACCAAATGGC 1260 

CAAAGACCAA ATTATTATCC TGAACCGTTA CTACAGCAGC AACAAAGAGA TTCTCAGGAG 1320 

CAACAGCAGC AAGTTCCGAT GGCTACAACC AGGGCTACTC AGTATCCCCC ACAAATAAAC 1380 

AGCAATAATT TTAATACTAA TCAAGCATCT GTACCTCCAC AAATGAGATC TAATCCACAA 1440 

CAGCCGCCTC AAGATAAACC AGCTGGCCAG TCAATTTGGT TGTAA 1485 


(2) INFORMATION FOR SEQ ID NO: 40: 











(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2625 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE : DNA 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 796..2580 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40: 

CATTTTCTTA ATTCTTTTAT GTGCTTTTAC TACTTTGTTT AGTTCAAAAC AATAGTCGTT 60 

ATTCTTAGGT ACTATAGCAT AAGACAAGAA AAGAAAAATA AGGGACAAAT AACATTAGCA 120 

GAAGTACGGT ATATTTTACT GTTACTTATA TACTTTCAAG AAGATGAGTT AAATCGGTAG 180 

CCAGTGTAGA AAAATAATAA TAAGGGTCAT CGATCCTTCG CATTTTATTA TCCAATTAAA 240 

GATACGAATC ACGGCAAACT ATATTCAAAG CTCATAGATA ATCGTCGTAA GGCTGACACT 300 

GCAGAAGAAA AGTCATAATT TGAATACTAG CCGGTATGAA ACTGTGATTG ATTAACCTGG 360 

GGTTACCTAA AGAGAACATA AGTAATACTC ATGACAGAAT CAAAACACAA TACAAAATTT 420 

ATCCGAACCT CGGCCCGACT GCGGCTCGCC GGGAAAGGGG ACAACCGCTT CTATCCGTCG 480 

ACTAACTTCA TCGGCCCAAT GGAAGCTATG ATATGGGGAT TTCCATTGAG CCGATAGCAA 540 

TGTAGGGTAA TACTGTTGCG TATATAGTGA TAGTTATTGA ATTTTATTAC CCTGCGGGAA 600 

TATTGAGACA TCACTAAGCA CGAATTTTAC GTCTGAGGAA AGTTGAATGA TGGCCAAATA 660 

ACCAGGAAAA ACAAATATTG AATCCTTGTG AAGGATTCCA CAGTTGTTTA ATCCTCCTTA 720 

AGCTCACTTA GTATCAATTG TCTAAATAAT ATTGCTTTGA ATCTGAAAAA AATAAAAGTA 780 

CCTTCGCATT AGACA ATG TCA CTG CCG CTA CGA CAC GCA TTG GAG AAC GTT 831 
Met Ser Leu Pro Leu Arg His Ala Leu Glu Asn Val 
1 5 10 

ACT TCT GTT GAT AGA ATT TTA GAG GAC TTA TTA GTA CGT TTT ATT ATA 879 
Thr Ser Val Asp Arg Ile Leu Glu Asp Leu Leu Val Arg Phe Ile Ile 
15 20 25 

AAT TGT CCG AAT GAA GAT TTA TCG AGT GTC GAG AGA GAG TTA TTT CAT 927 
Asn Cys Pro Asn Glu Asp Leu Ser Ser Val Glu Arg Glu Leu Phe His 
30 35 40 

TTT GAA GAA GCC TCA TGG TTT TAC ACG GAT TTC ATC AAA TTG ATG AAT 975 
Phe Glu Glu Ala Ser Trp Phe Tyr Thr Asp Phe Ile Lys Leu Met Asn 
45 50 55 60 

CCA ACT TTA CCC TCC CTA AAG ATT AAA TCA TTT GCT CAA TTG ATC ATA 1023 
Pro Thr Leu Pro Ser Leu Lys Ile Lys Ser Phe Ala Gln Leu Ile Ile 
65 70 75 

AAA CTA TGT CCT CTG GTT TGG AAA TGG GAC ATA AGA GTG GAT GAG GCA 1071 
Lys Leu Cys Pro Leu Val Trp Lys Trp Asp Ile Arg Val Asp Glu Ala 
80 85 90 

CTC CAG CAA TTC TCC AAG TAT AAG AAA AGT ATA CCG GTG AGG GGC GCT 1119 
Leu Gln Gln Phe Ser Lys Tyr Lys Lys Ser Ile Pro Val Arg Gly Ala 
95 100 105 

GCC ATA TTT AAC GAG AAC CTG AGT AAA ATT TTA TTG GTA CAG GGT ACT 1167 
Ala Ile Phe Asn Glu Asn Leu Ser Lys Ile Leu Leu Val Gln Gly Thr 
110 115 120 

GAA TCG GAT TCT TTG TCA TTC CCA AGG GGG AAG ATA TCT AAA GAT GAA 1215 
Glu Ser Asp Ser Leu Ser Phe Pro Arg Gly Lys Ile Ser Lys Asp Glu 
125 130 135 140 

AAT GAC ATA GAT TGT TGC ATT AGA GAA GTG AAA GAA GAA ATT GGT TTC 1263 
Asn Asp Ile Asp Cys Cys Ile Arg Glu Val Lys Glu Glu Ile Gly Phe 
145 150 155 

GAT TTG ACG GAC TAT ATT GAC GAC AAC CAA TTC ATT GAA AGA AAT ATT 1311 
Asp Leu Thr Asp Tyr Ile Asp Asp Asn Gln Phe Ile Glu Arg Asn Ile 
160 165 170 

CAA GGT AAA AAT TAC AAA ATA TTT TTG ATA TCT GGT GTT TCA GAA GTC 1359 
Gln Gly Lys Asn Tyr Lys Ile Phe Leu Ile Ser Gly Val Ser Glu Val 
175 180 185 

TTC AAT TTT AAA CCT CAA GTT AGA AAT GAA ATT GAT AAG ATA GAA TGG 1407 
Phe Asn Phe Lys Pro Gln Val Arg Asn Glu Ile Asp Lys Ile Glu Trp 
190 195 200 

TTC GAT TTT AAG AAA ATT TCT AAA ACA ATG TAC AAA TCA AAT ATC AAG 1455 
Phe Asp Phe Lys Lys Ile Ser Lys Thr Met Tyr Lys Ser Asn Ile Lys 
205 210 215 220 

TAT TAT CTG ATT AAT TCC ATG ATG AGA CCC TTA TCA ATG TGG TTA AGG 1503 
Tyr Tyr Leu Ile Asn Ser Met Met Arg Pro Leu Ser Met Trp Leu Arg 
225 230 235 

CAT CAG AGG CAA ATA AAA AAT GAA GAT CAA TTG AAA TCC TAT GCG GAA 1551 
His Gln Arg Gln Ile Lys Asn Glu Asp Gln Leu Lys Ser Tyr Ala Glu 
240 245 250 

GAA CAA TTG AAA TTG TTG TTG GGT ATC ACT AAG GAG GAG CAG ATT GAT 1599 
Glu Gln Leu Lys Leu Leu Leu Gly Ile Thr Lys Glu Glu Gln Ile Asp 
255 260 265 

CCC GGT AGA GAG TTG CTG AAT ATG TTA CAT ACT GCA GTG CAA GCT AAC 1647 
Pro Gly Arg Glu Leu Leu Asn Met Leu His Thr Ala Val Gln Ala Asn 
270 275 280 

AGT AAT AAT AAT GCG GTC TCC AAC GGA CAG GTA CCC TCG AGC CAA GAG 1695 
Ser Asn Asn Asn Ala Val Ser Asn Gly Gln Val Pro Ser Ser Gln Glu 
285 290 295 300 

CTT CAG CAT TTG AAA GAG CAA TCA GGA GAA CAC AAC CAA CAG AAG GAT 1743 
Leu Gln His Leu Lys Glu Gln Ser Gly Glu His Asn Gln Gln Lys Asp 
305 310 315 

CAG CAG TCA TCG TTT TCT TCT CAA CAA CAA CCT TCA ATA TTT CCA TCT 1791 
Gln Gln Ser Ser Phe Ser Ser Gln Gln Gln Pro Ser Ile Phe Pro Ser 
320 325 330 

CTT TCT GAA CCG TTT GCT AAC AAT AAG AAT GTT ATA CCA CCT ACT ATG 1839 
Leu Ser Glu Pro Phe Ala Asn Asn Lys Asn Val Ile Pro Pro Thr Met 
335 340 345 

p™ mI? f C ?™ TTC ATG TCA AAT CCT CAA TTG TTT GCG ACA ATG 1887 
Pro Met Ala Asn Val Phe Met Ser Asn Pro Gln Leu Phe Ala Thr Met 
350 355 360 

AAT GGC CAG CCT TTT GCA CCT TTC CCA TTT ATG TTA CCA TTA ACT AAC 1935 
Asn Gly Gln Pro Phe Ala Pro Phe Pro Phe Met Leu Pro Leu Thr Asn 
365 370 375 380 

AAT AGT AAT AGC GCT AAC CCT ATT CCA ACT CCG GTC CCC CCT AAT TTT 1983 
Asn Ser Asn Ser Ala Asn Pro Ile Pro Thr Pro Val Pro Pro Asn Phe 
385 390 395 

AAT GCT CCT CCG AAT CCG ATG GCT TTT GGT GTT CCA AAC ATG CAT AAC 2031 
Asn Ala Pro Pro Asn Pro Met Ala Phe Gly Val Pro Asn Met His Asn 
400 405 410 

CTT TCT GGA CCA GCA GTA TCT CAA CCG TTT TCC TTG CCT CCT GCT CCT 2079 
Leu Ser Gly Pro Ala Val Ser Gln Pro Phe Ser Leu Pro Pro Ala Pro 
415 420 425 

TTA CCG AGG GAC TCT GGT TAC AGC AGC TCC TCC CCT GGG CAG TTG TTA 2127 
Leu Pro Arg Asp Ser Gly Tyr Ser Ser Ser Ser Pro Gly Gln Leu Leu 
430 435 440 

GAT ATA CTA AAT TCG AAA AAG CCT GAC AGC AAC GTG CAA TCA AGC AAA 2175 
Asp Ile Leu Asn Ser Lys Lys Pro Asp Ser Asn Val Gln Ser Ser Lys 
445 450 455 460 

AAG CCA AAG CTT AAA ATC TTA CAG AGA GGA ACG GAC TTG AAT TCA CTC 2223 
Lys Pro Lys Leu Lys Ile Leu Gln Arg Gly Thr Asp Leu Asn Ser Leu 
465 470 475 

AAG CAA AAC AAT AAT GAT GAA ACT GCT CAT TCA AAC TCT CAA GCT TTG 2271 
Lys Gln Asn Asn Asn Asp Glu Thr Ala His Ser Asn Ser Gln Ala Leu 
480 485 490 

CTA GAT TTG TTG AAA AAA CCA ACA TCA TCG CAG AAG ATA CAC GCT TCC 2319 
Leu Asp Leu Leu Lys Lys Pro Thr Ser Ser Gln Lys Ile His Ala Ser 
495 500 505 

AAA CCA GAT ACT TCC TTT TTA CCA AAT GAC TCC GTA TCT GGT ATA CAA 2367 
Lys Pro Asp Thr Ser Phe Leu Pro Asn Asp Ser Val Ser Gly Ile Gln 
510 515 520 

GAT GCA GAA TAT GAA GAT TTC GAG AGT AGT TCA GAT GAA GAG GTG GAG 2415 
Asp Ala Glu Tyr Glu Asp Phe Glu Ser Ser Ser Asp Glu Glu Val Glu 
525 530 535 540 

ACA GCT AGA GAT GAA AGA AAT TCA TTG AAT GTA GAT ATT GGG GTG AAC 2463 
Thr Ala Arg Asp Glu Arg Asn Ser Leu Asn Val Asp Ile Gly Val Asn 
545 550 555 

GTT ATG CCA AGC GAA AAA GAC AGC CGA AGA AGT CAA AAG GAA AAA CCA 2511 
Val Met Pro Ser Glu Lys Asp Ser Arg Arg Ser Gln Lys Glu Lys Pro 
560 565 570 

AGG AAC GAC GCA AGC AAA ACA AAC TTG AAC GCT TCT GCA GAA TCT AAT 2559 
Arg Asn Asp Ala Ser Lys Thr Asn Leu Asn Ala Ser Ala Glu Ser Asn 
575 580 585 

AGT GTA GAA TGG GGG GCT GGG TAAATCTTCA CCCTCCGACT TCAGAGTAAC 2610 
Ser Val Glu Trp Gly Ala Gly 
590 595 



ACAGAATCCA CAGTA 



2625 
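
The CDS feature given for SEQ ID NO: 40 can be cross-checked against the
residue numbering of the translation with simple arithmetic. The following
sketch assumes that the LOCATION field uses inclusive, 1-based coordinates
and that the stop codon lies outside the stated range; the function name is
hypothetical.

```python
# Illustrative sketch: number of codons spanned by a CDS LOCATION field,
# assuming inclusive 1-based coordinates (start..end).

def cds_codon_count(start: int, end: int) -> int:
    """Return the number of complete codons between start and end."""
    length = end - start + 1  # both end points are part of the CDS
    if length % 3 != 0:
        raise ValueError("CDS length %d is not a multiple of 3" % length)
    return length // 3


if __name__ == "__main__":
    # SEQ ID NO: 40, LOCATION 796..2580
    print(cds_codon_count(796, 2580))  # prints 595
```

For SEQ ID NO: 40 the range covers 1785 nucleotides, or 595 codons, which
agrees with the final residue number (595) printed beneath the translation.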

(2) INFORMATION FOR SEQ ID NO:41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6854 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 2050..4053 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41: 

AGCTTCTCCC TTTTCCTTCA GTGCTGCTAC TCTCTGCTCT CCACTTAAGT GTTACAATTA 60 

ATTTGCAGCT AGTTTGCAGT TCGTACAACC TCGCCTATTC TTGTAACGAA GAAGAACGTA 120 

TTTATAATAT TGGGCTGTAA TGTGTTGAGT TTAGTAATAG ATAAAGTAGG ACAGAGTTCT 180 

GTCTTTGTTT ATCTATGGGG TTCAGAGTGA TAAGGGGCAG GATAAGGAAG TTAAAAAAAA 240 

AAAGGTTACG TTATATAACG AAAGAAAAGA AACGAGCGAA GTGCCAACTA TAGCCCAATA 300 

TCAAGAATGC AAGTCAGCAA AGTACAGTAA TCGTATGAAG ATACGCGATG CGTAATATCC 360 

CTCAAGGGCT CCGGATCAGA AAAGCTAAGG GAAGATCCTT ACATTACACG GCGTGCGACA 420 

GACTCGAACC ACAGCTAACT TCTCGTGAAA AGATGGCTTC AACTTCGCTC TTGCAATAAC 480 

TTTGAAACAC ACGAACAAAG GTTTATTGCG CTTGATTAAC GTTGGAAGTA TATGATACTA 540 

ATACTACTTT GTTCTCTAAG TCATCGCTAT ATGTTTATCT CGAGGAAAAG GTGCACGGCG 600 

GTACACAATT ACTTCGCCGT TTCGGGTAAA ACAAGTGTTA CATTTATAAT ATATATGTAT 660 

ATATGTATGT GCGCGTAAGT ATATGCCGTT CATAACAAAT CATCTTCTTG TTGCTGGATG 720 

GACTCCTTAA TTTTATTCAA AATGGTAATT TTCCATTTAT CTAGTCTCAT AAAATTGTCA 780 

AACTCCTTAC AGTGTTCGCT TAGCTGCTCG CTATCACCTT CATTAACAGC ATCGATTAAA 840 

CTTTTCAAGA AATTTGACTC CCTTGAATCC GCAAAATTCG GATCTTCACT TTGACCCTCT 900 

TGTAAAGTTC TTGCAGCAGC GACTGCATCA GTAGCAGCTA GCTGACAAAG CCCTTTTTTT 960 

AGGAAGTAAT CCTTCAAACT CCATTGGCTC AATCTATTGC CCATGCTGCT CTTGATCAAC 1020 

TTCGAATATA TATCACTTGC TTCAATATAT TGACCGTCAA GAGCCTTTAG ATCTGCGCAT 1080 

TTGATAAAAC ACTTATTCGA TAATGCTACC GACTGGTCTT GGGCATACCA CTCACCAGCG 1140 

AGCTCATAGC AATCTATAGC TTTTGCATAG TCATGCAAAT CATTTTCTAG AATTTCTCCA 1200 

AGCTCAAACT TGAAATTAGC ACCTCTCCGG AACTGCCCCC TATGAGTAAA AATTTGAATA 1260 

GCATTTTCTA ATGAATCCAC GGCGTTCACA GAGTTTCCAC CGCTTTTAAA GCATTTATAA 1320 

GCCTCTACGT AGGTATTTCC TGCTTCGTCT TCATTACCAG CCTTTTTCTG ATAGTCAGCA 1380 

GCTTTCAAAA ACGAGTCTCC TGCCAAGTTT AACTCTTTTC TTAGACGGTA AATGGTGGCT 1440 

GCTTGGACAC AAAGATCAGC AGCCTCCTCA AACTTGTATG AATCAGAACC GCTAAACAAT 1500 

TTCATGAAAC CCGATGAAGG AACACCCTTC TTCTCAGCCT TAACACAACG GGAAATATCA 1560 

ATTCCCGTAT TTCAATGTTA GTAATTTGCC TTCGTAAATT ACGGAATCAC ATAGCTTTCA 1620 

TTTTGTTCCT TTGATATATT TCCCTACTAC ATACTCTTTT CAATAACTCT ACAGGGTCTG 1680 

ACATTTTTAA CTTTCAGGTT AATGATGGTG TTCTTACTAT ATTCTCGAGT CGTACAGAAG 1740 

TTAGTTCAGA TAAACTGCTT CGGTGCTGCC CACTTCTTAT CATTACTTCA ACTTTACCTT 1800 

CCCTATACCT GTGTGTCCTT ATTAATTCAA GTTAATCCGA GGTAATAGAT TAGGGTAACC 1860 

TTCAATGATG TCACGAAACA CGGATGCTGC AACTTTGCGA TTTTTTCCTG GAAAAGAATA 1920 

ACAATTAAAG GCAGCCTTTC AGCTGAGATT ACCAGCAGGT CTTTGGAGAT TAGCGCAAGA 1980 

AGAAGTGTGA TATAGTACTC ATAGAGGCAG GCTACAGACT AGGGAAAGCG TGTTCAACAA 2040 

CAATAAGAA ATG GAG ACC AGT TCT TTT GAG AAT GCT CCT CCT GCA GCC 2088 
Met Glu Thr Ser Ser Phe Glu Asn Ala Pro Pro Ala Ala 
1 5 10 

ATC AAT GAT GCT CAG GAT AAT AAT ATA AAT ACG GAG ACT AAT GAC CAG 2136 
Ile Asn Asp Ala Gln Asp Asn Asn Ile Asn Thr Glu Thr Asn Asp Gln 
15 20 25 

GAA ACA AAT CAG CAA TCT ATC GAA ACT AGA GAT GCA ATT GAC AAA GAA 2184 
Glu Thr Asn Gln Gln Ser Ile Glu Thr Arg Asp Ala Ile Asp Lys Glu 
30 35 40 45 

AAC GGT GTG CAA ACG GAA ACT GGT GAG AAC TCT GCA AAA AAT GCC GAA 2232 
Asn Gly Val Gln Thr Glu Thr Gly Glu Asn Ser Ala Lys Asn Ala Glu 
50 55 60 

CAA AAC GTT TCT TCT ACA AAT TTG AAT AAT GCC CCC ACC AAT GGT GCT 2280 
Gln Asn Val Ser Ser Thr Asn Leu Asn Asn Ala Pro Thr Asn Gly Ala 
65 70 75 

TTG GAC GAT GAT GTT ATC CCA AAT GCT ATT GTT ATT AAA AAC ATT CCG 2328 
Leu Asp Asp Asp Val Ile Pro Asn Ala Ile Val Ile Lys Asn Ile Pro 
80 85 90 

TTT GCT ATT AAA AAA GAG CAA TTG TTA GAC ATT ATT GAA GAA ATG GAT 2376 
Phe Ala Ile Lys Lys Glu Gln Leu Leu Asp Ile Ile Glu Glu Met Asp 
95 100 105 

CTT CCC CTT CCT TAT GCC TTC AAT TAC CAC TTT GAT AAC GGT ATT TTC 2424 
Leu Pro Leu Pro Tyr Ala Phe Asn Tyr His Phe Asp Asn Gly Ile Phe 
110 115 120 125 

AGA GGA CTA GCC TTT GCG AAT TTC ACC ACT CCT GAA GAA ACT ACT CAA 2472 
Arg Gly Leu Ala Phe Ala Asn Phe Thr Thr Pro Glu Glu Thr Thr Gln 
130 135 140 

GTG ATA ACT TCT TTG AAT GGA AAG GAA ATC AGC GGG AGG AAA TTG AAA 2520 
Val Ile Thr Ser Leu Asn Gly Lys Glu Ile Ser Gly Arg Lys Leu Lys 
145 150 155 

GTG GAA TAT AAA AAA ATG CTT CCC CAA GCT GAA AGA GAA AGA ATC GAG 2568 
Val Glu Tyr Lys Lys Met Leu Pro Gln Ala Glu Arg Glu Arg Ile Glu 
160 165 170 

AGG GAG AAG AGA GAG AAA AGA GGA CAA TTA GAA GAA CAA CAC AGA TCG 2616 
Arg Glu Lys Arg Glu Lys Arg Gly Gln Leu Glu Glu Gln His Arg Ser 
175 180 185 

TCA TCT AAT CTT TCT TTG GAT TCT TTA TCT AAA ATG AGT GGA AGC GGA 2664 
Ser Ser Asn Leu Ser Leu Asp Ser Leu Ser Lys Met Ser Gly Ser Gly 
190 195 200 205 

AAC AAT AAT ACT TCT AAC AAT CAA TTA TTC TCG ACT CTA ATG AAC GGC 2712 
Asn Asn Asn Thr Ser Asn Asn Gln Leu Phe Ser Thr Leu Met Asn Gly 
210 215 220 

ATT AAT GCT AAT AGC ATG ATG AAC AGT CCA ATG AAT AAT ACC ATT AAC 2760 
Ile Asn Ala Asn Ser Met Met Asn Ser Pro Met Asn Asn Thr Ile Asn 
225 230 235 

AAT AAC AGT TCT AAT AAC AAC AAT AGT GGT AAC ATC ATT CTG AAC CAA 2808 
Asn Asn Ser Ser Asn Asn Asn Asn Ser Gly Asn Ile Ile Leu Asn Gln 
240 245 250 

CCT TCA CTT TCT GCC CAA CAT ACT TCT TCA TCG TTG TAC CAA ACA AAC 2856 
Pro Ser Leu Ser Ala Gln His Thr Ser Ser Ser Leu Tyr Gln Thr Asn 
255 260 265 

GTT AAT AAT CAA GCC CAG ATG TCC ACT GAG AGA TTT TAT GCG CCT TTA 2904 
Val Asn Asn Gln Ala Gln Met Ser Thr Glu Arg Phe Tyr Ala Pro Leu 
270 275 280 285 

CCA TCA ACT TCC ACT TTG CCT CTC CCA CCC CAA CAA CTG GAC TTC AAT 2952 
Pro Ser Thr Ser Thr Leu Pro Leu Pro Pro Gln Gln Leu Asp Phe Asn 
290 295 300 

GAC CCT GAC ACT TTG GAA ATT TAT TCC CAA TTA TTG TTA TTT AAG GAT 3000 
Asp Pro Asp Thr Leu Glu Ile Tyr Ser Gln Leu Leu Leu Phe Lys Asp 
305 310 315 

AGA GAA AAG TAT TAT TAC GAG TTG GCT TAT CCC ATG GGT ATA TCC GCT 3048 
Arg Glu Lys Tyr Tyr Tyr Glu Leu Ala Tyr Pro Met Gly Ile Ser Ala 
320 325 330 

TCC CAC AAG AGA ATT ATC AAT GTT TTG TGC TCG TAC TTA GGG CTA GTA 3096 
Ser His Lys Arg Ile Ile Asn Val Leu Cys Ser Tyr Leu Gly Leu Val 
335 340 345 

GAA GTA TAT GAT CCA AGA TTT ATT ATT ATC AGA AGA AAG ATT CTG GAT 3144 
Glu Val Tyr Asp Pro Arg Phe Ile Ile Ile Arg Arg Lys Ile Leu Asp 
350 355 360 365 

CAT GCT AAT TTA CAA TCT CAT TTG CAA CAA CAA GGT CAA ATG ACA TCT 3192 
His Ala Asn Leu Gln Ser His Leu Gln Gln Gln Gly Gln Met Thr Ser 
370 375 380 

GCT CAT CCT TTG CAG CCA AAC TCC ACT GGC GGC TCC ATG AAT AGG TCA 3240 
Ala His Pro Leu Gln Pro Asn Ser Thr Gly Gly Ser Met Asn Arg Ser 
385 390 395 

CAA TCT TAT ACA AGT TTG TTA CAG GCC CAT GCA GCA GCT GCA GCG AAT 3288 
Gln Ser Tyr Thr Ser Leu Leu Gln Ala His Ala Ala Ala Ala Ala Asn 
400 405 410 



WO 98/13502 



PCT/US97/17276 




AGT ATT AGC AAT CAG GCC GTT AAC AAT TCT TCC AAC AGC AAT ACT ATT 3336 
Ser lie Ser Asn Gin Ala Val Asn Asn Ser Ser Asn Ser Asn Thr lie 
415 420 425 

AAC AGT AAT AAC GGT AAC GGT AAC AAT GTC ATC ATT AAT AAC AAT AGC 3384 
Asn Ser Asn Asn Gly Asn Gly Asn Asn Val lie lie Asn Asn Asn Ser 
430 435 440 445 

GCC AGC TCA ACA CCA AAA ATT TCT TCA CAG GGA CAA TTC TCC ATG CAA 3432 
Ala Ser Ser Thr Pro Lys lie Ser Ser Gin Gly Gin Phe Ser Met Gin 
450 455 460 

CCA ACA CTA ACC TCA CCT AAA ATG AAC ATA CAC CAT AGT TCT CAA TAC 3480 
Pro Thr Leu Thr Ser Pro Lys Met Asn He His His Ser Ser Gin Tyr 
465 470 475 

AAT TCC GCA GAC CAA CCG CAA CAA CCT CAA CCA CAA ACA CAG CAA AAT 3528 
Asn Ser Ala Asp Gin Pro Gin Gin Pro Gin Pro Gin Thr Gin Gin Asn 
480 485 490 

GTT CAG TCA GCT GCG CAA CAA CAA CAA TCT TTT TTA AGA CAA CAA GCT 3576 
Val Gin Ser Ala Ala Gin Gin Gin Gin Ser Phe Leu Arg Gin Gin Ala 
495 500 505 

ACT TTA ACA CCA TCC TCA AGA ATT CCA TCC GGT TAT TCT GCC AAC CAT 3624 
Thr Leu Thr Pro Ser Ser Arg He Pro Ser Gly Tyr Ser Ala Asn His 
510 515 520 525 

TAT CAA ATC AAT TCC GTT AAT CCC TTA CTG AGA AAT TCT CAA ATT TCA 3672 
Tyr Gin He Asn Ser Val Asn Pro Leu Leu Arg Asn Ser Gin He Ser 
530 535 540 

CCT CCA AAT TCA CAA ATC CCA ATC AAC AGC CAA ACC CTA TCC CAA GCG 3720 
Pro Pro Asn Ser Gin He Pro He Asn Ser Gin Thr Leu Ser Gin Ala 
545 550 555 

CAA CCA CCA GCA CAG TCC CAA ACT CAA CAA CGG GTA CCA GTG GCA TAC 3768 
Gin Pro Pro Ala Gin Ser Gin Thr Gin Gin Arg Val Pro Val Ala Tyr 
560 565 570 

CAA AAT GCT TCA TTG TCT TCC CAG CAG TTG TAC AAC CTT AAC GGC CCA 3816 
Gin Asn Ala Ser Leu Ser Ser Gin Gin Leu Tyr Asn Leu Asn Gly Pro 
575 580 585 

TCT TCA GCA AAC TCA CAG TCC CAA CTG CTT CCA CAG CAC ACA AAT GGC 3864 
Ser Ser Ala Asn Ser Gln Ser Gln Leu Leu Pro Gln His Thr Asn Gly
590 595 600 605 

TCA GTA CAT TCT AAT TTC TCA TAT CAG TCT TAT CAC GAT GAG TCC ATG 3912 
Ser Val His Ser Asn Phe Ser Tyr Gln Ser Tyr His Asp Glu Ser Met
610 615 620

TTG TCC GCA CAC AAT TTG AAT AGT GCC GAC TTG ATC TAT AAA TCT TTG 3960 
Leu Ser Ala His Asn Leu Asn Ser Ala Asp Leu Ile Tyr Lys Ser Leu
625 630 635 

AGT CAC TCT GGA CTA GAT GAT GGC TTG GAA CAG GGC TTG AAT CGT TCT 4008 
Ser His Ser Gly Leu Asp Asp Gly Leu Glu Gln Gly Leu Asn Arg Ser
640 645 650 

TTA AGC GGA CTG GAT TTA CAA AAC CAA AAC AAG AAG AAT CTA TGG 4053 
Leu Ser Gly Leu Asp Leu Gln Asn Gln Asn Lys Lys Asn Leu Trp
655 660 665 

TAATATATAC TTCCATTATT CTATGATTAT AGAGTTTGTT TGGTATTTGT ATATCGCACG 4113

ATACAAGTAA TGAGGGGTGC TTACACAAGA TAAAAGATAA AAAAATATAT ATATATAATA 4173 

AAAACCATCA AAAACACCAT TGAAAAAAAA TATAAAAAAA AAAAAAAATA ACCGAATATG 4233 

AATATGAAAT TAATGATCAT GATGAAGTTA ATTTTTACTG AGAAACGTCA CCTAATGTCG 4293

ATGAAACGAT GATAATGAAT GAATGATGAG GCTACTTTAA GTAACGCAAT GTAATCAAGC 4353

CAAAATTATC CCTCTTTTTT TTTTTTCCCT CTTTTGAGAT TTTATTTTTA ACCTACTACT 4413 

TACTTTTTTT TTTTGAACGT TCTTTTCCCA CATACTTTTA TATATGGTAT TTATATGTAC 4473 

GATGTTTAAT CACAGAGATG TTTCTACCTT ACTCGATATT GTTTTTGCAT TAATTGATAT 4533 

CTTGCTCACT GCATCATTGG CGGTATTTGT AGTATATAGA AAGTCGGGTA ACAATAATT*:: 4593 

ATTGACATTT CTTTGTTTAC AATGATCAGA GAAGAGCAGA AAGTTTCATA GTCAAACGTT 4653

CAGGCCAATT GAACAAGAAA TTATTCGTTT TTTTAGTCGT TGAGTGTTCA ACTGACATGC 4713 

TATTTTGGTG GTTCTTGATT AATTGGGGGC TTCATTGTTT GAAATAAAGA GTCGGGAAAA 4773 

TAGCACAGAA ACAAAGCATA TTAAAAGAGG CAAAAGAAGA AAGAACGAAT ATAAAAGGTA 4833 

AAAAAGGAAA AGCATTGCTA TTCTTTTCTC ATAGGTGTTA TTCATACCGC CCTCTCTCTT 4893

CTTCCTTCTT CATTAATTAG TCTCCGTATA ATTTGCAGAT AATGTCATTA ACAGCAAACG 4953 

ACGAATCGCC AAAACCCAAA AAAAATGCAT TATTGAAAAA CTTAGAGATC GATGATCTGA 5013 

TACATTCTCA ATTTGTCAGA AGCGATACAA ATGGACATAG AACTACAAGA CGACTATTGA 5073 

ACTCCGATGC CAGTATATCA CATCGAATAA GAGGAAGTGT TCGGTCTGAT AAAGGCCTTA 5133 

ATAAAATAAA AAAAGGGTTG ATTTCCCAGC AGTCCAAACT TGCGTCAGAA AATTCTTCTC 5193 

AAAATATCGT TAATAGGGAC AATAAGATGG GAGCAGTAAG TTTCCCCATT ATTGAACCTA 5253 

ATATTGAAGT CAGCGAGGAG TTGAAGGTTA GAATTAAGTA TGATTCTATC AAATTTTTCA 5313 

ATTTTGAAAG ACTAATATCT AAATCTTCAG TCATAGCACC TTTAGTTAAC AAAAATATAA 5373 

CATCATCCGG TCCTCTAATC GGGTTTCAAA GAAGAGTTAA CAGGTTAAAG CAAACATGGG 5433 

ATCTAGCAAC CGAAAACATG GAGTACCCAT ATTCTTCTGA TAATACGCCA TTCAGGGATA 5493

ACGATTCTTG GCAATGGTAC GTACCATACG GCGGAACAAT AAAAAAAATG AAAGATTTCA 5553 

GTACAAAAAG AACTTTACCC ACCTGGGAAG ATAAAATAAA GTTTCTTACA TTTTTAGAAA 5613 

ACTCTAAGTC TGCAACGTAC ATTAATGGTA ACGTATCACT TTGCAATCAT AATGAAACCC 5673 

ATCAAGAAAA CGAAGATAGG AAAAAAAGGA AAGGGAAAGT ACCAAGAATC AAAAATAAAG 5733 

TGTGGTTTTC CCAGATAGAA TACATTGTTC TTCGAAATTA TGAAATTAAA CCTTGGTATA 5793 

CATCTCCTTT TCCGGAACAC ATCAACCAAA ATAAAATGGT TTTTATATGT GAGTTCTGCC 5853 

TAAAATATAT GACTTCTCGA TATACTTTTT ATAGACACCA ACTAAAGTGT CTAACTTTTA 5913

AGCCCCCCGG AAATGAAATT TATCGCGACG GTAAGCTGTC TGTTTGGGAA ATTGATGGGC 5973 

GGGAGAATGT CTTGTATTGT CAAAATCTTT GCCTGTTGGC AAAATGTTTT ATCAATTCTA 6033

AGACTTTGTA TTACGATGTT GAACCGTTTA TATTCTATAT TCTAACGGAG AGAGAGGATA 6093

CAGAGAACCA TCCCTATCAA AACGCAGCCA AATTCCATTT CGTAGGCTAT TTCTCCAAGG 6153

AAAAATTCAA CTCCAATGAC TATAACCTAA GTTGTATTTT AACTCTACCC ATATACCAGA 6213

GGAAAGGATA TGGTCAGTTT TTGATGGAAT TTTCATATTT ATTATCCAGA AAGGAGTCAA 6273

AATTTGGAAC TCCTGAAAAA CCATTGTCGG ATTTAGGATT ATTGACTTAC AGAACGTTTT 6333

o I IjL 1 Ali AA AATTAAG AG A CAGTGCTAGA CGTCGATCAA 6393

ATAATAAAAA TGAAGATACT TTTCAGCAGG TTAGCCTAAA CGATATCGCT AAACTAACAG 6453

GAATGATACC AACAGACGTT GTGTTTGGAT TGGAACAACT TCAAGTTTTG TATCGCCATA 6513

AAACACGCTC ATTATCCAGT TTGGATGATT TCAACTATAT TATTAAAATC GATTCTTGGA 6573

ACAGGATTGA AAATATTTAC AAAACTTGGA GCTCAAAAAA CTATCCTCGC GTCAAATATG 6633

ACAAACTATT GTGGGAACCT ATTATATTAG GGCCGTCATT TGGTATAAAT GGGATGATGA 6693

ACTTAGAACC CACCGCATTA GCGGACGAAG CTCTTACAAA TGAAACTATG GCTCCGGTAA 6753

TTTCGAATAA CACACATATA GAAAACTATA ACAACAGTAG AGCACATAAT AAACGCAGAA 6813

GAAGAAGAAG AAGAAGTAGT GAGCACAAAA CATCCAAGCT T 6854



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2814 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..696 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

GAA TTC CAA TAC ACC AAA CAG CTG CAT TTC CCT GTG GGG CCC AAA TCC 48 
Glu Phe Gln Tyr Thr Lys Gln Leu His Phe Pro Val Gly Pro Lys Ser
1 5 10 15

ACA AAC TGT GAG GTA GCG GAA ATT CTT TTA CAC TGC GAC TGG GAA AGG 96 
Thr Asn Cys Glu Val Ala Glu Ile Leu Leu His Cys Asp Trp Glu Arg
20 25 30 

TAC ATA AAT GTT TTA AGT ATA ACA AGA ACA CCA AAT GTT CCT AGT GGT 144 
Tyr Ile Asn Val Leu Ser Ile Thr Arg Thr Pro Asn Val Pro Ser Gly
35 40 45 

ACC AGT TTC AGC ACC AGA ACG AGG TAC ATG TTC CGA TGG GAT GAC CAG 192 
Thr Ser Phe Ser Thr Arg Thr Arg Tyr Met Phe Arg Trp Asp Asp Gln
50 55 60 

GGG CAA GGT TGC ATA TTA AAA ATA AGT TTT TGG GTG GAC TGG AAC GCA 240
Gly Gln Gly Cys Ile Leu Lys Ile Ser Phe Trp Val Asp Trp Asn Ala
65 70 75 80 

TCC AGT TGG ATC AAG CCA ATG GTA GAG AGC AAT TGT AAA AAT GGA CAA 288 
Ser Ser Trp Ile Lys Pro Met Val Glu Ser Asn Cys Lys Asn Gly Gln
85 90 95 

ATT AGC GCC ACT AAG GAC TTG GTA AAG TTA GTC GAA GAA TTT GTA GAG 336 
Ile Ser Ala Thr Lys Asp Leu Val Lys Leu Val Glu Glu Phe Val Glu
100 105 110

AAA TAC GTG GAA TTG AGC AAA GAA AAA GCA GAT ACA CTC AAG CCG TTG 384 
Lys Tyr Val Glu Leu Ser Lys Glu Lys Ala Asp Thr Leu Lys Pro Leu 
115 120 125 

CCC AGT GTT ACA TCT TTT GGA TCA CCT AGG AAA GTG GCA GCA CCG GAG 432 
Pro Ser Val Thr Ser Phe Gly Ser Pro Arg Lys Val Ala Ala Pro Glu 
130 135 140 

CTG TCG ATG GTA CAG CCG GAG TCG AAA CCA GAA GCT GAG GCG GAA ATC 480 
Leu Ser Met Val Gln Pro Glu Ser Lys Pro Glu Ala Glu Ala Glu Ile
145 150 155 160 

TCA GAA ATA GGC AGC GAC AGA TGG AGG TTT AAC TGG GTG AAC ATA ATA 528 
Ser Glu Ile Gly Ser Asp Arg Trp Arg Phe Asn Trp Val Asn Ile Ile
165 170 175 

ATC TTG GTG CTC TTG GTG TTA AAT CTG CTG TAT TTA ATG AAG TTG AAC 576 
Ile Leu Val Leu Leu Val Leu Asn Leu Leu Tyr Leu Met Lys Leu Asn
180 185 190 

AAG AAG ATG GAT AAG CTG ACG AAC CTC ATG ACC CAC AAG GAC GAA GTT 624 
Lys Lys Met Asp Lys Leu Thr Asn Leu Met Thr His Lys Asp Glu Val 
195 200 205 

GTA GCG CAC GCG ACT CTA TTG GAC ATA CCA GCC CAA GTA CAA TGG TCA 672 
Val Ala His Ala Thr Leu Leu Asp Ile Pro Ala Gln Val Gln Trp Ser
210 215 220 

AGA CCA AGA AGG GGA GAC GTG TTG TAACAGAGTA ATCATGTAAT ATTGTATGTA 726 
Arg Pro Arg Arg Gly Asp Val Leu 
225 230 

AGGTTATGTA TGTTCGTATG GTATGGAAAA AAAAAAAAAA AAAGGATGCT ATGTGGAGAA 786 

TGTAAGGCGT GGTAGCTCCG GATAATTCAG TCTGTAGGCT TCATCACGGG CAGTGGCCTG 846 

ACTCTGAGAG CTTGCTCCGG TATTAAGTTG TGCGTTTGAA ATTTTCTGGA AAAAAGAAAT 906 

TGATTGGTTG AAGCTATACT CGTCGAAAGA TTTCTTCGGC AGTGGTTGTT GCTCCACCTG 966 

CACGGGAGTT GTGTTTGCGT TTATGTTCGG CTTGGCTATA TTATTAGCGA GTGATGTTTG 1026 

CAATTTGCTG TATTGAGAAT CAATTTGGGT GCGTAAGCTT TCAATAATTT TGCAGACCGC 1086 

AGGCACTTCC AACTTTATGA GTTGCAGGTA TTCTCTTTTA TGAATATACG ATGACGACGA 1146 

TGACGACGAC GCATCCATGC GCAAAAGCTC AGGGTGTCTA GATAGTTTGT TAGTCAATAA 1206 

ATCCACATAT CTAAAATAAT AAATAAACGA CAGCGACAAG TCGTTGGCCT GGAACGCACA 1266 

CTGTGCCTTT TCCAATATGC CGATGCATGT TTTCAGGTAA ATTCTCAATG GTATCGCCGG 1326 

ATTGAAGCGA TAATCCTTAG CGTCCTGAAC CAATTGCTTA CTAGACTTCA TGACCTACCG 1386

GGGCCAGATA AAGATGCGGA AGGAAGAGAA AAAATGTATA GTGGTTGGTG AACCGCAACA 1446 

ATAATTCGTG CCAACACTTT AATCGAAGCA AAAATTGTCT TGTATGTTAT TAATATTATC 1506 

TATCTAACCA TTGATTTACG TATAAAACTG TCGATGCTCA TCGCCTAGCA ATGAAAAAAT 1566 

TTTTTCTTTT TTTTTTCATT ATTTCTCTTT GTTGCGTACT TTTTTTCATT GCGTTTCGCG 1626 

GCAAAAGCGA TTCGAGTTGA CTGGAAGTGT GTTATACTAT AAAAAGTGTA TATGCCTATT 1686

TTTGGTTCTG ATCTTTACTT TACTGTTAAG TACTGGCTGA GGCAGTAGAC TCTGCCTCTG 1746 

TTACGGCAGC GGTATTCGCC TCGGCATCAG CAGCCGCCCA CGGTAGAGTA GGTTCTGTTG 1806 

TTTTGACGTT TGCCAAGGTA CTGTCCAAAT GCTCCTTCAG CAAGGCCTCA TTACTTTCCT 1866 

TCTCCGGACC CACCGATTGC GTGATCTCCT GTACACGGTT CAAGAACTTG TTCAAATTGT 1926

AGCCCGCAGC AGCATCAGAG ACTTCTTGTG TGTAAGGGAC ACCCCTCAAC TCCTTGACTC 1986

TTCTTTTGTG CACTTTGCCC TTTAAATGCG TTTTTAACGC TATAGCAGTC TCCATGTATT 2046 

TGGCACAGTG TATGCAATAG TGCTGACCAA GGCCCGGTTT GGTTTCATCC AATGGCTGGT 2106 

TCAGAAGCTT CTGTACTGAT TCCTTGGTGG ACAAATCGTT ATAGATCAGG TCCAAGTCTC 2166 

GTGTTCTTCT TTTAGTCTTG TATCTCTTCA CCGAATATCT ACCCATGATG CGCTATTGTT 2226 

TTATCTTCAC TTGTCTGTGT GTTTAACTGC CTTTCAATTC ACCTCATCTC ATCTCCCGCT 2286 

ACTTTCCATA TATAAAAGCA AAATTAATTT GCTTTTTCCC CTGTCAGTAT AAAAAAATTT 2346 

TCCGCAGGAT ATAGAAAAAA AAGAAATGAA ATTATAGTAG CGGTTATTTC CGTGGGGTGC 2406 

TTTTTTACAC CTGTACATCT TTTCCCTCCG TACATTTTTT TTATTTTTTT TTTGGGTTTT 2466 

TTTTTTTCGA TATTTTTCCC TCCGAAACTA GTTAGCACAA TAATGCTGAC TAAGGAAACT 2526 

TTTCATCTCA GAATTGATGG TCAGTTTGGT TTCTCTAGAG AATAGTTTAT AAAAAGATGT 2586 

TGATGTGGAG CAACCATTTA TACATCCTTT CCGCAAGTGC TTTTGGAGTG GGACTTTCAA 2646 

ACTTTAAAGT ACAGTATATC AAATAACTAA TTCAAGATGG CTAGAAGACC AGCTAGATGT 2706 

TACAGATACC AAAAGAACAA GCCTTACCCA AAGTCTAGAT ACAACAGAGC TGTTCCAGAC 2766 

TCCAAGATCA GAATCTACGA TTTGGGTAAG AAGAAGGCTA CCGTCGAT 2814 
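
As an illustrative aside (not part of the filed sequence listing), the coding rows of SEQ ID NO: 42 can be checked against the standard genetic code. The minimal Python sketch below assumes NCBI translation table 1 and uses hypothetical names (`translate`, `first_row`, `last_row`). It translates the first coding row and the last coding row, the latter extended by the three bases TAA that immediately follow residue 232 in the listing.

```python
# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_ONE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Stop",
}


def translate(dna: str) -> list[str]:
    """Translate an in-frame DNA string into three-letter residue codes."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    return [ONE_TO_THREE[CODON_TO_ONE[c]] for c in codons]


# Rows copied from the listing; the trailing TAA is the stop codon.
first_row = "GAA TTC CAA TAC ACC AAA CAG CTG CAT TTC CCT GTG GGG CCC AAA TCC"
last_row = "AGA CCA AGA AGG GGA GAC GTG TTG TAA"

print(" ".join(translate(first_row)))
print(" ".join(translate(last_row)))
```

The feature table gives the coding region as 1..696, that is 696 / 3 = 232 codons, which agrees with the residue numbering of the translated rows ending at 232.
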
(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Ile Asn Leu Lys Ala Leu Ala Ala Leu Ala Lys Lys Ile Leu

WHAT IS CLAIMED IS:

1. A host cell transformed or transfected with DNA

comprising: 

a repressor gene encoding a repressor protein, said 
repressor gene under transcriptional control of a promoter; 

a selectable marker gene encoding a selectable marker 
protein; said selectable marker gene under transcriptional 
control of an operator; said operator regulated by interaction 
with said repressor protein; 

a first recombinant fusion protein gene encoding a first 
binding protein or binding fragment thereof in frame with 
either a DNA binding domain of a transcriptional activating 
protein or a transactivating domain of a transcriptional 
activating protein; and 

a second recombinant fusion protein gene encoding a 
second binding protein or binding fragment thereof in frame 
with either a DNA binding domain of a transcriptional 
activating protein or a transactivating domain of a 
transcriptional activating protein, whichever domain is not 
encoded by the first fusion protein gene, said second binding 
protein or binding fragment thereof capable of interacting with 
said first binding protein or binding fragment thereof such that 
interaction of said second binding protein or binding fragment 
thereof and said first binding protein or binding fragment 
thereof brings into proximity a DNA binding domain and a 
transactivating domain forming a functional transcriptional 
activating protein; said functional transcriptional activating 
protein acting on said promoter to increase expression of said 
repressor gene. 
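
The "whichever domain is not encoded by the first fusion protein gene" condition of claim 1 amounts to requiring that the two fusions carry complementary domains. The following is a minimal Python sketch of that constraint only. The type and function names are hypothetical, and the assignment of CREB to the DNA binding domain and CBP to the transactivating domain is chosen purely for illustration (claim 12 names the pair but not which domain each partner is fused to).

```python
from dataclasses import dataclass
from enum import Enum


class Domain(Enum):
    DNA_BINDING = "DNA binding domain"
    TRANSACTIVATING = "transactivating domain"


@dataclass(frozen=True)
class Fusion:
    """A binding protein (or fragment) fused in frame to one activator domain."""
    binding_protein: str
    domain: Domain


def complementary(first: Fusion, second: Fusion) -> bool:
    """True when the two fusions supply different domains, as claim 1 requires."""
    return first.domain is not second.domain


# Illustrative assignment only; the claims do not fix which partner gets which domain.
bait = Fusion("CREB", Domain.DNA_BINDING)
prey = Fusion("CBP", Domain.TRANSACTIVATING)
print(complementary(bait, prey))
```

Two fusions carrying the same domain could not reconstitute a functional transcriptional activating protein on interaction, which is why the check compares domains rather than binding partners.
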

2. The host cell of claim 1 wherein said DNA binding
domain and said transactivating domain are derived from a common 
transcriptional activating protein. 

3. The host cell of claim 1 wherein one or more of the 
repressor gene, the selectable marker gene, the first recombinant fusion 
protein gene, and the second recombinant fusion protein gene are encoded on
distinct DNA expression constructs. 

4. The host cell of claim 1 wherein said selectable marker 
protein is an enzyme in a pathway for synthesis of a nutritional requirement 
for said host cell such that expression of said selectable marker protein is 
required for growth of said host cell on media lacking said nutritional 
requirement. 

5. The host cell of claim 1 wherein said host cell is a yeast cell
or a mammalian cell.

6. The host cell of claim 2 wherein said selectable marker gene
encodes HIS3.

7. The host cell of claim 2 wherein said repressor protein gene
encodes a tetracycline resistance protein.

8. The host cell of claim 2 wherein said operator is a tet
operator.

9. The host cell of claim 2 wherein said promoter is selected
from the group consisting of the LexA promoter, the alcohol dehydrogenase
promoter, and the Gal4 promoter.

10. The host cell of claim 2 wherein said DNA binding domain
is derived from a protein selected from the group consisting of LexA and Gal4.

11. The host cell of claim 2 wherein said transactivating
domain is derived from a protein selected from the group consisting of VP16
and Gal4.



12. The host cell of claim 2 wherein the first binding protein 
is CREB and the second binding protein is CBP. 

13. The host cell of claim 2 wherein the first binding protein 
is Tax and the second binding protein is SRF. 

14. The host cell of claim 2 wherein the first binding protein 
is casein kinase I and the second binding protein is CREB. 

15. The host cell of claim 2 wherein the first binding protein 
is AKAP 79 and the second binding protein is selected from the group 
consisting of RI, RII and calcineurin. 



16. A method to identify an inhibitor of binding between a first 
binding protein or binding fragment thereof and a second binding protein or 
binding fragment thereof comprising the steps of: 

a) growing host cells of any one of claims 1 through 15 in 
the absence of a test compound and under conditions 
which permit expression of said first binding protein or 
binding fragment thereof and said second binding 
protein or binding fragment thereof such that said first 
binding protein or fragment thereof and second binding 
protein or binding fragment thereof interact bringing 
into proximity said DNA binding domain and said
transactivating domain forming said functional 
transcriptional activating protein; said transcriptional 
activating protein acting on said promoter to increase 
expression of said repressor protein; said repressor 
protein interacting with said operator such that said 
selectable marker protein is not expressed; 

b) confirming lack of expression of said selectable marker 
protein in said host cell; 

c) growing said host cells in the presence of a test 
compound; and 

d) comparing expression of said selectable marker protein 
in the presence and absence of said test compound 
wherein increased expression of said selectable marker 
protein is indicative that the test compound is an 
inhibitor of binding between said first binding protein or 
binding fragment thereof and said second binding 
protein or binding fragment thereof. 
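
The selection logic recited in claim 16 can be summarized as a small truth table. The following Python sketch is an illustration only: the function names are hypothetical, expression is reduced to on/off states, and the growth readout assumes a HIS3-type marker assayed on medium lacking the corresponding nutrient, as in claims 4 and 6.

```python
def marker_expressed(partners_bind: bool) -> bool:
    """On/off model of the circuit described in claim 16.

    Binding of the two partners reconstitutes the activator, the activator
    drives the repressor gene, and the repressor shuts off the marker gene.
    """
    activator_functional = partners_bind
    repressor_made = activator_functional
    return not repressor_made


def grows_on_selective_medium(partners_bind: bool) -> bool:
    """Growth requires the selectable marker (assumed HIS3-type readout)."""
    return marker_expressed(partners_bind)


def is_candidate_inhibitor(growth_without_compound: bool,
                           growth_with_compound: bool) -> bool:
    """Steps b) to d): no growth without the compound, growth with it."""
    return (not growth_without_compound) and growth_with_compound


# Step a)/b): intact interaction, marker off, no growth.
baseline = grows_on_selective_medium(partners_bind=True)
# Step c): a compound that disrupts binding restores marker expression.
treated = grows_on_selective_medium(partners_bind=False)
print(is_candidate_inhibitor(baseline, treated))
```

A compound that scores in this model is only a candidate: anything that relieves repression by another route, for example by acting on the repressor itself, would give the same growth readout, so a secondary assay would be needed to confirm that binding between the two partners is the target.
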

17. The method of claim 16 wherein 
the host cell is a yeast cell; 
the selectable marker gene encodes HIS3; 
transcription of the selectable marker gene is regulated
by the tet operator;

the repressor protein gene encodes the tetracycline 
resistance protein; 

transcription of the tetracycline resistance protein is 
regulated by the LexA promoter; 
the DNA binding domain is derived from LexA; and 
the transactivating domain is derived from VP16.

18. The method of claim 16 wherein

the host cell is a yeast cell; 
the selectable marker gene encodes HIS3; 
transcription of the selectable marker gene is regulated 
by the tet operator; 

the repressor protein gene encodes the tetracycline 
resistance protein; 

transcription of the tetracycline resistance protein is 
regulated by the alcohol dehydrogenase promoter; 
the DNA binding domain is derived from LexA; and 
the transactivating domain is derived from VP16. 

19. A kit to identify an inhibitor of binding between a first 
binding protein or binding fragment thereof and a second binding protein or 
binding fragment thereof, said inhibitor identified by the method of claim 16. 



FIGURE 1

[Schematic; recoverable labels: Tet op, Tet R, HIS3, No Growth, Growth]



PCT 



WORLD INTELLECTUAL PROPERTY ORGANIZATION 
International Bureau 




INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(51) International Patent Classification 6: C12N 15/12, 15/10, 15/63, 15/81, 1/19, C12Q 1/68

A3

(11) International Publication Number: WO 98/13502

(43) International Publication Date: 2 April 1998 (02.04.98)



(21) International Application Number: PCT/US97/17276

(22) International Filing Date: 26 September 1997 (26.09.97) 



(30) Priority Data: 08/721,730; 27 September 1996 (27.09.96); US



(71) Applicant: ICOS CORPORATION [US/US]; 22021 20th Avenue, S.E., Bothell, WA 98021 (US).

(72) Inventors: GOODMAN, Richard, H.; 18560 Westview Drive, Lake Oswego, OR 97034-7382 (US). HOEKSTRA, Merl, F.; 10321 216th, S.E., Snohomish, WA 98290 (US).

(74) Agent: WILLIAMS, Joseph, A., Jr.; Marshall, O'Toole, Gerstein, Murray & Borun, 6300 Sears Tower, 233 S. Wacker Drive, Chicago, IL 60606-6402 (US).



(81) Designated States: AU, BR, CA, CN, CZ, FI, HU, IL, JP, MX,
NO, PL, RU, SK, European patent (AT, BE, CH, DE, DK,
ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE).



Published 

With international search report. 

Before the expiration of the time limit for amending the claims 
and to be republished in the event of the receipt of amendments. 

(88) Date of publication of the international search report: 16 July 1998 (16.07.98)



(54) Title: METHOD TO IDENTIFY COMPOUNDS FOR DISRUPTING PROTEIN/PROTEIN INTERACTIONS

(57) Abstract 

The present invention relates generally to materials 
and methods for identification of inhibitors of interactions 
between known binding partner proteins. 














INTERNATIONAL SEARCH REPORT 



International Application No

PCT/US 97/17276 



A. CLASSIFICATION OF SUBJECT MATTER 

IPC 6 C12N15/12 C12N15/10 C12N15/63 C12N15/81 C12N1/19 
C12Q1/68 

According to International Patent Classification (IPC) or to both national classification and IPC



B. FIELDS SEARCHED



Minimum documentation searched (classification system followed by classification symbols) 

IPC 6 C12N C12Q 



Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched 



Electronic data base consulted during the international search (name of data base and, where practical, search terms used)



C. DOCUMENTS CONSIDERED TO BE RELEVANT 



Category* / Citation of document, with indication, where appropriate, of the relevant passages / Relevant to claim No.

X   WO 95 2G652 A (MEDIGENE GMBH ;ALTMANN HERBERT (DE); WENDLER WOLFGANG (DE)) 3 August 1995
    Relevant to claim No.: 1-11, 16-19

Y   see the whole document
    Relevant to claim No.: 12

Y   P. SUN AND R. MAURER: "An inactivating point mutation demonstrates that interaction of cAMP response element binding protein (CREB) with the CREB binding protein is not sufficient for transcriptional activation", J. BIOL. CHEM., vol. 270, no. 13, 31 March 1995, AM. SOC. BIOCHEM. MOL. BIOL., INC., BALTIMORE, US, pages 7041-7044, XP002052726; cited in the application; see the whole document
    Relevant to claim No.: 12











Further documents are listed in the continuation of box C.

Patent family members are listed in annex.



* Special categories of cited documents:

"A" document defining the general state of the art which is not considered to be of particular relevance

"E" earlier document but published on or after the international filing date

"L" document which may throw doubts on priority claim(s) or which is cited to establish the publication date of another citation or other special reason (as specified)

"O" document referring to an oral disclosure, use, exhibition or other means

"P" document published prior to the international filing date but later than the priority date claimed

"T" later document published after the international filing date or priority date and not in conflict with the application but cited to understand the principle or theory underlying the invention

"X" document of particular relevance; the claimed invention cannot be considered novel or cannot be considered to involve an inventive step when the document is taken alone

"Y" document of particular relevance; the claimed invention cannot be considered to involve an inventive step when the document is combined with one or more other such documents, such combination being obvious to a person skilled in the art

"&" document member of the same patent family



Date of the actual completion of the international search: 20 January 1998

Date of mailing of the international search report: 27-05-1998


Name and mailing address of the ISA

European Patent Office, P.B. 5818 Patentlaan 2
NL-2280 HV Rijswijk
Tel. (+31-70) 340-2040, Tx. 31 651 epo nl,
Fax: (+31-70) 340-3016

Authorized officer

HORNIG H.



Form PCT/ISA/210 (second sheet) (July 1992)






C. (Continuation) DOCUMENTS CONSIDERED TO BE RELEVANT

Category* / Citation of document, with indication, where appropriate, of the relevant passages / Relevant to claim No.

J.C. CHRIVIA ET AL.: "Phosphorylated CREB binds specifically to the nuclear protein CBP", NATURE, vol. 365, 28 October 1993, MACMILLAN JOURNALS LTD., LONDON, UK, pages 855-859, XP002052727; cited in the application; see the whole document
    Relevant to claim No.: 12

WO 96 03501 A (CIBA GEIGY AG ;CHAUDHURI BHABATOSH (CH); STEPHAN CHRISTINE (FR); F) 8 February 1996; cited in the application; see the whole document
    Relevant to claim No.: 1-12, 16-19

WO 96 03499 A (HARVARD COLLEGE ;KIRSCHNER MARC W (US); KINOSHITA NORIYUKI (US)) 8 February 1996; cited in the application; see the whole document
    Relevant to claim No.: 1-12, 16-19

WO 95 32284 A (CIBA GEIGY AG ;MATTHIAS PATRICK (CH); STRUBIN MICHEL (CH)) 30 November 1995; see the whole document
    Relevant to claim No.: 1-12, 16-19

WO 96 01313 A (BUJARD HERMANN ;GOSSEN MANFRED (US)) 18 January 1996; see the whole document
    Relevant to claim No.: 1-12, 16-19

US 5 283 173 A (FIELDS STANLEY ET AL) 1 February 1994; cited in the application; see the whole document
    Relevant to claim No.: 1-12, 16-19

P,X SHIH H-M ET AL: "A positive genetic selection for disrupting protein-protein interactions: Identification of CREB mutations that prevent association with the coactivator CBP.", PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA 93 (24). 1996. 13896-13901. ISSN: 0027-8424, 16 November 1996, XP002052728; see the whole document
    Relevant to claim No.: 1-12, 16-19



Form PCT/ISA/210 (continuation of second sheet) (July 1992)






Box I Observations where certain claims were found unsearchable (Continuation of item 1 of first sheet) 



This International Search Report has not been established in respect of certain claims under Article 17(2)(a) for the following reasons:

1. [ ] Claims Nos.:
because they relate to subject matter not required to be searched by this Authority, namely:

2. [ ] Claims Nos.:
because they relate to parts of the International Application that do not comply with the prescribed requirements to such an extent that no meaningful International Search can be carried out, specifically:

3. [ ] Claims Nos.:
because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a).



Box II Observations where unity of invention is lacking (Continuation of item 2 of first sheet)

This International Searching Authority found multiple inventions in this international application, as follows:

See annex

1. [ ] As all required additional search fees were timely paid by the applicant, this International Search Report covers all searchable claims.

2. [ ] As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite payment of any additional fee.

3. [ ] As only some of the required additional search fees were timely paid by the applicant, this International Search Report covers only those claims for which fees were paid, specifically claims Nos.:

4. [X] No required additional search fees were timely paid by the applicant. Consequently, this International Search Report is restricted to the invention first mentioned in the claims; it is covered by claims Nos.: 1-12, 16-19

Remark on Protest
[ ] The additional search fees were accompanied by the applicant's protest.
[ ] No protest accompanied the payment of additional search fees.



Form PCT/ISA/210 (continuation of first sheet (1)) (July 1992)



International Application No. PCT/US 97/17276

1. Claims: 1-12, 16-19

A host cell transformed or transfected with DNA comprising:
a repressor gene encoding a repressor protein, said
repressor gene under transcriptional control of a promoter;
a selectable marker gene encoding a selectable protein; said
marker gene under transcriptional control of an operator;
said operator regulated by interaction with said repressor
protein; a first recombinant fusion protein gene encoding a
first binding protein or binding fragment thereof in frame
with either a transactivator domain of a transcriptional
activator protein; and a second recombinant fusion protein
or binding fragment thereof in frame with either a DNA
binding domain of a transcriptional activating protein,
whichever domain is not encoded by the first fusion protein
gene, said second binding protein or binding fragment
thereof capable of interacting with said first binding
protein or binding fragment thereof such that interaction of
said second binding protein or binding fragment thereof
brings into proximity a DNA binding domain and a
transactivating domain forming a functional transcriptional
activating protein; said functional transcriptional
activating protein acting on said promoter to increase
expression of said repressor gene; said DNA binding domain
and said transactivating domain are derived from a common
transcriptional activating protein; one or more of the
repressor gene, the selectable marker gene, and the first
and second recombinant fusion protein genes, are encoded on
distinct DNA expression constructs; said host cell wherein
said selectable marker protein is an enzyme; said host cell
is a yeast cell or a mammalian cell; said selectable marker gene
encodes HIS3, said repressor protein gene encodes a
tetracycline resistant protein; said operator is a tet
operator; said promoter is selected from the group consisting
of the LexA-, the alcohol dehydrogenase-, the GAL4-promoter;
said DNA binding domain is derived from a protein from the
group consisting of LexA and Gal4; said transactivating
domain is derived from a protein selected from the group
consisting of VP16 and Gal4; said first binding protein is
CREB and the second binding protein is CBP; a method and kit
to identify an inhibitor of binding between a first and a
second binding protein using said host cell.



2. Claim : 13 

The host cell of subject one but wherein the first binding 
protein is Tax and the second binding protein is SRF. 

3. Claim : 14



The host cell of subject one but wherein the first binding 
protein is casein kinase I and the second binding protein is 
CREB. 



4. Claim : 15 



The host cell of subject one but wherein the first binding 
protein is AKAP 79 and the second binding protein is 
selected from the group consisting of RI, RII and calcineurin.